Block Change Tracking and Duplicate: avoid ORA-19755

If you use Block Change Tracking on your production database and try to duplicate it, there is a good chance that you will encounter this error:

The problem is caused by the block change tracking file entry that exists in the target controlfile: Oracle cannot find the file, because the directory structure on the auxiliary server is different.

After the restore and recovery of the auxiliary database, the duplicate process tries to open the database, but the BCT file does not exist and the error is thrown.

If you do a quick Google search you will find several workarounds:

  • disable block change tracking after you get the error and open the auxiliary instance manually (this prevents you from getting the duplicate outcome from the RMAN return code)
  • disable BCT on the target before running the duplicate (this forces your incremental backups to read the whole target database!)
  • Richard Harrison proposed another workaround, you can read more about it here.

There is another workaround that I like more (and that you can also find as a comment in Richard’s post):

  • Disable Block Change Tracking on the auxiliary while it is doing the restore/recovery (in MOUNTED status)

(This solution doesn’t come from me; as far as I know, its father is a colleague of mine at Trivadis.)

You can easily fork a process before running the duplicate command that:

  • loops and checks the auxiliary instance status
  • runs the disable as soon as the auxiliary is mounted (see the sketch below)
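In essence, the watcher only needs to run two statements against the auxiliary instance; here is a minimal SQL sketch of the idea (the loop and the connection handling around it are left out):

    -- poll the auxiliary instance until it reports MOUNTED
    SELECT status FROM v$instance;

    -- as soon as the instance is mounted, disable block change tracking
    ALTER DATABASE DISABLE BLOCK CHANGE TRACKING;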

I’ve worked out this script that does the job:

Run it just before the duplicate! e.g.

HTH

Ludovico

Get the last database backup for all databases in a SQL Server instance

I don’t like to publish small code snippets, but I’ve just rewritten one of my most used SQL scripts for SQL Server, which gets the details of the last backup for every database (both BACKUP DATABASE and BACKUP LOG).

It now makes use of WITH () and RANK() OVER () to make it much easier to read and modify.

So I think it’s worth sharing.

As you can see, modifying it to include, for example, incremental backups should be very easy.
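Something along these lines gives the idea (a simplified sketch built on msdb.dbo.backupset, with a CTE and RANK() OVER()):

    WITH last_bck AS (
        SELECT  bs.database_name,
                bs.type,          -- 'D' = full, 'I' = differential, 'L' = log
                bs.backup_finish_date,
                RANK() OVER (PARTITION BY bs.database_name, bs.type
                             ORDER BY bs.backup_finish_date DESC) AS rk
        FROM    msdb.dbo.backupset bs
    )
    SELECT  d.name AS database_name,
            MAX(CASE WHEN l.type = 'D' THEN l.backup_finish_date END) AS last_full_backup,
            MAX(CASE WHEN l.type = 'L' THEN l.backup_finish_date END) AS last_log_backup
    FROM    sys.databases d
    LEFT JOIN last_bck l
           ON l.database_name = d.name
          AND l.rk = 1
    GROUP BY d.name
    ORDER BY d.name;

Adding differential backups would just mean adding another CASE expression for type 'I'.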

Cheers

Ludo

Cloning a PDB with ASM and Data Guard (no ADG) without network transfer

OK, if you’re reading this post, you may also want to read the previous one, which explains something more about the problem.

In short: if you have a CDB running on ASM in an MAA architecture and you do not have Active Data Guard, when you clone a PDB you have to “copy” the datafiles somehow to the standby. The only solution offered by Oracle (in a MOS note, not in the documentation) is to restore the PDB from the primary to the standby site, thus transferring it over the network. But if you have a huge PDB this is a bad solution, because it impacts your network connectivity. (Note: ending up with a huge PDB, IMHO, can only be caused by bad consolidation; I do not recommend consolidating huge databases on Multitenant.)

So I’ve worked out another solution. It still has many defects and is hardly viable, but it’s technically interesting, because it lets you discover a little more about Multitenant and Data Guard.

The three options

At the primary site the process is always the same: Oracle copies the datafiles of the source PDB and modifies their headers so that they can be used by the new PDB (it changes CON_ID, DBID, FILE#, and so on).

On the standby site, by contrast, what happens depends on the option you choose:

Option 1: Active Data Guard

If you have ADG, ADG itself takes care of copying the datafiles on the standby site, from the source standby PDB to the destination standby PDB. Once the copy is done, MRP0 continues the recovery. The modification of the header block of the destination PDB is done by MRP0 immediately after the copy (at least, this is what I understand).

ADG_PDB_copy

Option 2: No Active Data Guard, but STANDBYS=none

In this case, the copy on the standby site doesn’t happen, and the recovery process just adds the entries of the new datafiles to the controlfile, with status OFFLINE and name UNKNOWNxxx. However, the source files cannot be copied anymore, because the MRP0 process expects to find a copy of the destination datafile, not the source datafile. Also, any attempt to restore datafile 28 (in this example) will give an error, because it does not belong to the destination PDB. So the only chance is to restore the destination PDB from the primary.
NOADG_PDB_STANDBYS_NONE_copy

Option 3: No Active Data Guard, no STANDBYS=none

This is the case that I actually want to explain. Without the flag STANDBYS=NONE, the MRP0 process expects to change the header of the new datafile, but because the file does not exist yet, the recovery process dies.
We can then copy it manually from the source standby PDB and restart the recovery process, which will change the header. This needs to be repeated for each datafile (that’s why it’s not a viable solution right now).

NOADG_PDB_copy

Let’s try it together:

The Environment

Primary

Standby

The current user PDB (any resemblance to real people is purely coincidental 😉 #haveUSeenMaaz):

Cloning the PDB on the primary

First, make sure that the source PDB is open read-only
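For example (a quick sketch; MAAZ is the source PDB here):

    ALTER PLUGGABLE DATABASE maaz CLOSE IMMEDIATE;
    ALTER PLUGGABLE DATABASE maaz OPEN READ ONLY;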

Then, clone the PDB on the primary without the clause STANDBYS=NONE:
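Something along these lines (a sketch; LUDO is the name of the new PDB):

    CREATE PLUGGABLE DATABASE ludo FROM maaz;
    -- note: no STANDBYS=NONE appended here, contrary to Option 2 above

With OMF in place, no FILE_NAME_CONVERT clause should be needed on the primary.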

Review the clone on the Standby

At this point, the standby alert log shows that the SYSTEM datafile is missing, and the recovery process stops.

One remarkable thing is that in the standby controlfile ONLY THE SYSTEM DATAFILE exists:
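A quick check from the mounted standby (a sketch, assuming the new PDB is named LUDO):

    SELECT d.file#, d.name, d.status
    FROM   v$datafile d
           JOIN v$pdbs p ON p.con_id = d.con_id
    WHERE  p.name = 'LUDO';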

We need to fix the datafiles one by one, but most of the steps can be done once for all the datafiles.

Copy the source PDB from the standby

What do we need to do? Well, the recovery process is stopped, so we can safely copy the datafiles of the source PDB from the standby site, because they have not moved yet. (Meanwhile, we can put the primary source PDB back in read-write mode.)

Copy the datafiles:

Do the magic

Now comes the interesting part: we need to assign the datafile copies of the MAAZ PDB to LUDO.

Sadly, OMF will create the copies in the wrong location (they are copies, so they are created in the same location as the source PDB).

We cannot just uncatalog and recatalog the copies, because they will ALWAYS be assigned to the source PDB. Nor can we use RMAN, because it will never associate the datafile copies with the new PDB. We need to rename the files manually.

It’s better to uncatalog the datafile copies first, so we keep the catalog clean:

Then, because we cannot rename files on a standby database with standby file management set to AUTO, we need to set it temporarily to MANUAL.

standby_file_management is not PDB modifiable, so we need to do it for the whole CDB.

After the renames, we need to set standby_file_management back to AUTO, or the recovery will not start:
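The whole sequence is roughly this (a sketch: the ASM file names are placeholders, the real ones being the name recorded in the standby controlfile and the copy taken in the previous step):

    -- allow manual file manipulation on the standby
    ALTER SYSTEM SET standby_file_management='MANUAL';

    -- point the SYSTEM datafile of the new PDB to the copy of the source datafile
    ALTER DATABASE RENAME FILE
      '+DATA/path/recorded/in/the/standby/controlfile'
      TO '+DATA/path/of/the/copy/of/the/maaz/system/datafile';

    -- set it back, or managed recovery will not restart
    ALTER SYSTEM SET standby_file_management='AUTO';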

We can now restart the recovery.
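For example (run on the standby, in the foreground, so that it can be interrupted when it stops on the next missing datafile):

    ALTER DATABASE RECOVER MANAGED STANDBY DATABASE;
    -- once all the datafiles are fixed, it can be detached in the background:
    -- ALTER DATABASE RECOVER MANAGED STANDBY DATABASE DISCONNECT FROM SESSION;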

The recovery process will:
– change the new datafile by modifying the header for the new PDB
– create the entry for the second datafile in the controlfile
– crash again because the datafile is missing

We already have the SYSAUX datafile, right? So we can alter its name again:

This time all the datafiles have been copied (there are no user datafiles in this example) and the recovery process continues!! 🙂 So we can hit ^C and start it again in the background.

The Data Guard configuration reflects the success of this operation.

Do we miss anything?

Of course we do!! The datafiles of the new PDB reside in the wrong ASM path. We need to fix them!

 

I know there’s no practical use for this procedure, but it helps a lot in understanding how Multitenant has been implemented.

I expect some improvements in 12.2!!

Cheers

Ludo

 

Oracle Database Backup Logging Recovery Appliance – a preview

Please see the disclaimer at the end of the post.

Oracle announced the new Oracle Database Backup Logging Recovery Appliance at the last Open World 2013, but it has not been released to the market yet, and very little information is available on the Oracle website.

During the last IOUG Collaborate 14, Larry Carpenter, Oracle Master Product Manager for Data Guard and MAA, unveiled something more about the DBLRA (call it “Debra” to simplify your life 🙂 ), and I’ve had the chance to discuss it with him directly.

At Trivadis we think that this appliance will be a game changer in the world of backup management.

Why?

Well, if you have ever worked for a big company with many hundreds of databases, you have certainly encountered many of these common problems:

  • Oracle Backup and restore penalized by a shared infrastructure
  • Poor backup or restore performance
  • Tape drives busy when you need them urgently
  • Complex management of backup retentions

That’s not all. As of now, your best recovery point in case of restore is directly related to your archivelog backup frequency. Oh yes, you can lower your archive_lag_target parameter and increase your log switch frequency (and thus the I/O), and still have… 10, 15, 30 minutes of possible data loss. Unless you protect your transactions with Data Guard. But that costs money: for the additional server and storage, for the licenses, and for the effort required to put a Data Guard instance in place for every database that you want to protect. If you want to protect your transactions from a storage failure, there’s a price to pay.

The Database Backup Logging Recovery Appliance (wow, I need to copy and paste the name to save time! :-)) overcomes these problems with a simple but brilliant idea: leveraging the existing redo transport mechanism and shipping the redo stream directly to the backup appliance (the DBLRA, of course) or to its cloud alter ego, hosted by Oracle.

DBLRA

As you can infer from the picture, 12c databases will work natively with the appliance, while previous releases will have a plugin that enables all the capabilities.

Backups can be mirrored selectively to another DBLRA, or copied to the cloud or to a 3rd party (Virtual) Tape Library.

The backup retention is enforced by the appliance, and expiration and deletion are done automatically using the embedded RMAN catalog.

Lightning-fast backups and restores are guaranteed by the hardware: the DBLRA is based on the same hardware as Exadata, with High Capacity disks. Optional storage extensions can be added to increase the capacity, but all the data, as I’ve said, can be offloaded to VTLs in order to use cheaper storage for older backups.

To sum up, the key values are:

  • No transaction loss!!
  • Lightning fast backups and restores
  • Integrated, Oracle engineered, scalable solution for hundreds to thousands of databases

Looking forward to seeing it in action!

I cannot cover all the information I have in a single post, but Trivadis is working actively to be ready to implement it at the time of its launch to the market (estimated in 2014), so feel free to contact me if you are interested in boosting your backup environment. 😉

By the way, I expect that the competitors (IBM, Microsoft?) will try to develop a solution with the same characteristics in terms of reliability, or they will lose ground.

Cheers!

Ludovico

Disclaimer: This post is intended to outline Oracle’s general product direction based on information gathered through public conferences. It is intended for informational purposes only. The development and release of these functionalities and features, including the release dates, remain at the sole discretion of Oracle, and no documentation is available at this time. The features and commands shown may or may not be accurate when the final product goes GA (General Availability).
Please refer to the Oracle documentation when it becomes available.

SQLServer centralized backup monitoring with PowerShell and TSQL (2/2)

In my previous post I’ve shown how to collect data and insert it into a database table using PowerShell. Now it’s time to get some information out of that data, and I’ve used TSQL for this purpose.

The backup exceptions

Every environment has some backup rules and some backup exceptions. For example, you don’t want to check for failures on the model, Northwind, AdventureWorks or distribution databases.

I got rid of this problem by using the “exception table” created in the previous post. The rules are defined by pattern matching. First, we need to define a generic rule for our default backup schedules:
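Assuming a hypothetical exception table called DB_Exceptions (the table and column names here are purely illustrative, with pattern columns and thresholds in hours), the generic rule could look like this:

    -- hypothetical table and column names, for illustration only
    INSERT INTO DB_Exceptions
        (InstancePattern, DatabasePattern, FullThresholdHours, LogThresholdHours, BestBefore, Description)
    VALUES
        ('%', '%', 36, 12, NULL, 'Default rule: full backup every 36h, log backup every 12h');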

In the previous example, we check that all databases (‘%’) on all instances (again ‘%’) have been backed up at least every 36 hours, and that a log backup has occurred in the last 12 hours. The description is useful to remember why such a rule exists.

The “BestBefore” column allows you to define a time limit for the rule. For example, if you are doing some maintenance and skipping some schedules, you can safely insert a rule that expires after X days, so you avoid some alerts while also avoiding forgetting to delete the rule afterwards.
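Again a purely hypothetical sketch with the same assumed columns (the date is just an example; very high thresholds effectively silence the checks):

    INSERT INTO DB_Exceptions
        (InstancePattern, DatabasePattern, FullThresholdHours, LogThresholdHours, BestBefore, Description)
    VALUES
        ('SERVER1', '%', 9999, 9999, '2014-05-12', 'Maintenance on SERVER1: skip backup checks until May 12th');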

The previous rule will skip backup reports on SERVER1 until May 12th.
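And, with the same hypothetical columns, a rule that silences every check on any database named Northwind, with no expiry date:

    INSERT INTO DB_Exceptions
        (InstancePattern, DatabasePattern, FullThresholdHours, LogThresholdHours, BestBefore, Description)
    VALUES
        ('%', 'Northwind', 99999, 99999, NULL, 'Northwind is a sample database: never check its backups');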

The previous rule will skip all reports on all Northwind databases.

Important: If multiple rules apply to the same database, the rule with a higher time threshold wins.

The queries

The following query lists the databases whose last full backup is older than the defined threshold:

And the following does the same for the transaction logs:
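As a rough sketch of the full-backup check (using the same hypothetical DB_Exceptions columns and the DB_Status table populated by the PowerShell collector; the log check is analogous, using LastLogBackup and LogThresholdHours):

    WITH rules AS (
        SELECT  s.InstanceName, s.DatabaseName, s.LastFullBackup,
                MAX(e.FullThresholdHours) AS threshold_hours  -- the highest matching threshold wins
        FROM    DB_Status s
        JOIN    DB_Exceptions e
               ON  s.InstanceName LIKE e.InstancePattern
               AND s.DatabaseName LIKE e.DatabasePattern
               AND (e.BestBefore IS NULL OR e.BestBefore >= GETDATE())
        GROUP BY s.InstanceName, s.DatabaseName, s.LastFullBackup
    )
    SELECT  InstanceName, DatabaseName, LastFullBackup
    FROM    rules
    WHERE   LastFullBackup IS NULL
       OR   LastFullBackup < DATEADD(hour, -threshold_hours, GETDATE());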

Putting all together

Copy and paste the following to a new Transact-SQL job step in SQLAgent:
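The skeleton of such a job step could be something like this sketch (the Database Mail profile, the recipients and the simplified WHERE clause are placeholders; the real check applies the exception rules shown above):

    DECLARE @html NVARCHAR(MAX);

    -- build an HTML table with the databases that breach the rules
    SET @html = N'<table border="1">'
              + N'<tr><th>Instance</th><th>Database</th><th>Last full backup</th></tr>'
              + (SELECT td = InstanceName, '',
                        td = DatabaseName, '',
                        td = CONVERT(varchar(20), LastFullBackup, 120)
                 FROM   DB_Status
                 WHERE  LastFullBackup IS NULL
                    OR  LastFullBackup < DATEADD(hour, -36, GETDATE())  -- simplified condition
                 FOR XML PATH('tr'))
              + N'</table>';

    -- if nothing breaches the rules, the subquery returns NULL and no mail is sent
    IF @html IS NOT NULL
        EXEC msdb.dbo.sp_send_dbmail
             @profile_name = 'DBA_profile',            -- placeholder
             @recipients   = 'dba-team@example.com',   -- placeholder
             @subject      = 'Databases with missing backups',
             @body         = @html,
             @body_format  = 'HTML';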

Any comment appreciated!

Previous: SQLServer centralized backup monitoring with PowerShell and TSQL (1/2)

🙂

Ludovico

SQLServer centralized backup monitoring with PowerShell and TSQL (1/2)

Checking database backups has always been one of the main concerns of DBAs. With Oracle it is quite easy thanks to a central RMAN catalog, but with other databases doing it with little effort can be a great challenge.

Some years ago I developed a little framework to check all SQLServer databases. This framework was based on Linux (strange but true!), bash, freetds, sqsh and flat configuration files. It is still doing its job well, but not all SQLServer DBAs can deal with complex bash scripting, so a customer of mine asked me if I could rewrite it in a more Microsoft-like language.

So I decided to go for a PowerShell script, in conjunction with a couple of tables for the configuration and the data, and a simple TSQL script to provide HTML reporting. I have to say I’m not an expert in PowerShell, but it seems far from being as flexible as other scripting languages (damn, compared to Perl, Python or PHP they only have the initial ‘P’ in common). However, I managed to do something usable.

The principle

It is quite simple: the PowerShell script looks up the list of instances in a reference table, then connects to each one sequentially and retrieves the following data:

  • recovery mode
  • status
  • creation time
  • last full backup
  • last log backup

This data is merged into a table in the central repository. Finally, a TSQL script does some reporting.

sql_centralized_backup_monitor_schema

Custom classes in PowerShell

One of the big annoyances with PowerShell is the lack of a way to define custom classes, which is especially odd considering that PowerShell is highly object-oriented. To define your own classes to work with, you have to define them in another language (C# in this example):

For better code reading, I’ve put this definition in a separate file (DatabaseBackup.ps1).

The query that retrieves the data…

This is the query I actually use to get the information:
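Something along these lines would return the same kind of information from the system catalog and msdb (a sketch, not necessarily the exact query used by the script):

    SELECT  d.name                 AS database_name,
            d.recovery_model_desc  AS recovery_mode,
            d.state_desc           AS status,
            d.create_date          AS creation_time,
            MAX(CASE WHEN b.type = 'D' THEN b.backup_finish_date END) AS last_full_backup,
            MAX(CASE WHEN b.type = 'L' THEN b.backup_finish_date END) AS last_log_backup
    FROM    sys.databases d
    LEFT JOIN msdb.dbo.backupset b
           ON b.database_name = d.name
    GROUP BY d.name, d.recovery_model_desc, d.state_desc, d.create_date;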

I’ve also put this snippet in a separate file, queries.ps1, to improve readability.

The tables

The first table (DB_Servers) can be as simple as a single column containing the instances to check. It can be replaced by any other kind of source, like a corporate CMDB or similar.

The second table will contain the collected data. Of course it can be expanded!

The third table will contain some rules for managing exceptions. Such exceptions can be useful if you have situations like “all databases named Northwind should not be checked”. I’ll show some examples in the next post.
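As a sketch of what the three tables could look like (DB_Servers and DB_Status are the names used later in this post; the exception table name and all the column names are just assumptions):

    CREATE TABLE DB_Servers (
        InstanceName    sysname NOT NULL PRIMARY KEY
    );

    CREATE TABLE DB_Status (
        InstanceName    sysname      NOT NULL,
        DatabaseName    sysname      NOT NULL,
        RecoveryMode    nvarchar(60) NULL,
        Status          nvarchar(60) NULL,
        CreationTime    datetime     NULL,
        LastFullBackup  datetime     NULL,
        LastLogBackup   datetime     NULL,
        LastCheck       datetime     NOT NULL DEFAULT GETDATE(),
        PRIMARY KEY (InstanceName, DatabaseName)
    );

    -- exception rules: pattern matching on instance and database names
    CREATE TABLE DB_Exceptions (
        InstancePattern     sysname       NOT NULL,
        DatabasePattern     sysname       NOT NULL,
        FullThresholdHours  int           NOT NULL,
        LogThresholdHours   int           NOT NULL,
        BestBefore          datetime      NULL,   -- rule expiry date
        Description         nvarchar(500) NULL
    );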

 

The main code

Change this to whatever you want…

This initializes the files explained earlier:

This adds the required snap-in to query sqlserver

Given an instance name, the following function will:

  • Get the data in a ResultSet
  • Create an instance of the DatabaseBackup class (the one we defined in the external file) for each row
  • Return an array of DatabaseBackup objects with all the data ready to be processed

This is the real “main” of the script: it connects to the central instance and gets the list of the instances to check:

Finally, for each instance we have to check, we trigger the function that collects the data and we insert the results into the central repository (I’m using a MERGE to update the existing records).
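The MERGE could look roughly like this (a sketch with the same hypothetical DB_Status columns as above and example literal values; the PowerShell script substitutes the real values for each DatabaseBackup object):

    MERGE DB_Status AS t
    USING (SELECT 'SERVER1'   AS InstanceName,   -- example values only
                  'Northwind' AS DatabaseName,
                  'FULL'      AS RecoveryMode,
                  'ONLINE'    AS Status,
                  CAST('2014-01-01' AS datetime) AS CreationTime,
                  CAST('2014-05-01' AS datetime) AS LastFullBackup,
                  CAST('2014-05-02' AS datetime) AS LastLogBackup) AS s
       ON (t.InstanceName = s.InstanceName AND t.DatabaseName = s.DatabaseName)
    WHEN MATCHED THEN
        UPDATE SET RecoveryMode   = s.RecoveryMode,
                   Status         = s.Status,
                   CreationTime   = s.CreationTime,
                   LastFullBackup = s.LastFullBackup,
                   LastLogBackup  = s.LastLogBackup,
                   LastCheck      = GETDATE()
    WHEN NOT MATCHED THEN
        INSERT (InstanceName, DatabaseName, RecoveryMode, Status,
                CreationTime, LastFullBackup, LastLogBackup)
        VALUES (s.InstanceName, s.DatabaseName, s.RecoveryMode, s.Status,
                s.CreationTime, s.LastFullBackup, s.LastLogBackup);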

How to use it

  • Create the tables and insert your instances in the table db_servers.
  • Put the three files (Collect_Backup_Data.ps1, queries.ps1 and DatabaseBackup.ps1) in a directory, and modify the instance name and db name in Collect_Backup_Data.ps1
  • Schedule the main script using SQLAgent as an Operating system (CmdExec) job step:

  • You can’t use the internal PowerShell job step of SQLServer because it’s not fully compatible with PowerShell 2.0.
  • Check that the table db_status is getting populated

Limitations

  • The script uses Windows authentication, assuming you are working with a centralized domain user. If you want to use SQL authentication (for example, if you are a multi-tenant managed services provider) you need to store your passwords somewhere…
  • This script is intended to be used with single instances. It should work on clusters, but I haven’t tested it.
  • Check the backup chain up to the tape library: relying only on the information contained in the msdb is not a reliable monitoring solution!!

In my next post we’ll see how to generate HTML reports via email and how to manage exceptions.

Hope you’ll find it useful.

Again, PLEASE, if you improve it, kindly send me back a copy or blog about it and post the link in the comments!

Next: SQLServer centralized backup monitoring with PowerShell and TSQL (2/2)

Cheers

Ludo