RMAN Catalog Housekeeping: how to purge the old incarnations

First, let me apologize because every post in my blog starts with a disclaimer… but sometimes it is really necessary. 😉

Disclaimer: this blog post contains PL/SQL code that deletes incarnations from your RMAN recovery catalog. Please DON’T use it unless you deeply understand what you are doing, as it can compromise your backup and recovery strategy.

Small introduction

You may have a central RMAN catalog that stores all the backup metadata for your databases. If that is the case, you will have a database entry for each of your databases and a new incarnation entry for each duplicate, incomplete recovery or flashback (or whatever).

You should also have a delete strategy that deletes the obsolete backups from either your DISK or SBT_TAPE media. If you have old incarnations, however, after some time you will notice that their information never goes away from your catalog, and sooner or later you may end up having to do some housekeeping. But there is nothing more tedious than checking and deleting the incarnations one by one, especially if you have fairly big numbers like this catalog:

Here db, dbinc, bdf and brl contain respectively the registered databases, incarnations, datafile backups and archivelog backups.

Different incarnations?

Consider the following query:

You can run it safely: it returns the list of incarnations hierarchically connected to their parent, by database name, key and level.
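To make the idea of "hierarchically connected to their parent" concrete, here is a minimal Python sketch of the same parent/child walk done outside the database. The tuple layout (incarnation key, parent incarnation key, database name) and the function name are my own assumptions for illustration; they are not the catalog query itself.

```python
from collections import defaultdict

def incarnation_tree(rows):
    """rows: iterable of (dbinc_key, parent_dbinc_key or None, db_name).

    Returns a list of (level, dbinc_key, db_name) with every parent
    listed before its children, level 1 being a root incarnation.
    """
    rows = list(rows)
    known = {key for key, _, _ in rows}
    children = defaultdict(list)
    for key, parent, name in rows:
        # Assumption: an incarnation whose parent is not in the input
        # is treated as a root.
        children[parent if parent in known else None].append((key, name))

    result = []

    def walk(parent, level):
        for key, name in sorted(children[parent]):
            result.append((level, key, name))
            walk(key, level + 1)

    walk(None, 1)
    return result

# Hypothetical keys: incarnation 1 is the root, 2 and 4 branch from it.
sample = [(2, 1, "DB"), (1, None, "DB"), (3, 2, "DB"), (4, 1, "DB")]
for level, key, name in incarnation_tree(sample):
    print("  " * (level - 1) + f"{name} {key}")
```

Two incarnations at the same level under one parent (2 and 4 in the sample) are the typical signature of repeated flashbacks or incomplete recoveries.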

Then you have several types of behaviors:

  • Normal databases (created once, never restored or flashed back) will have just one or two incarnations (it depends on how they are created):

They are usually the ones that you may want to keep in your catalog, unless the database no longer exists: in that case, perhaps you forgot to remove it from the catalog when you dropped the database?

  • Flashed back databases (flashed back multiple times) will have as many incarnations as the number of flashbacks, but all connected with the incarnation prior to the flashback:

Here, although you have several incarnations, they all belong to the same database (same DB_KEY and DBID), so you must also keep them inside the recovery catalog.

  • Non-production databases that are frequently refreshed from the production database (via duplicate) will have several incarnations with different DBIDs and DB_KEYs:

This is usually the most frequent case: here you want to delete the old incarnations, but only as long as there are no backups attached to them that are still in the recovery window.

  • You may also have orphaned incarnations:

In this case, again, it depends whether the DBID and DB_KEY are the same as the current incarnation or not.

What do you need to delete?

Basically:

  • Incarnations of databases that no longer exist
  • Incarnations of existing databases where the database has a more recent current incarnation, only if there are no backups still in the retention window

How to do it?

In order to be 100% sure that you can delete an incarnation, you have to verify that it has no recent backups (for instance, no backups more recent than the current recovery window for that database). If the database does not have a specified recovery window but rather the default “CONFIGURE RETENTION POLICY TO REDUNDANCY 1; # default”, it is a bit more problematic… in this case let’s assume that we consider “old” an incarnation that has not been backed up for 1 year (365 days), ok?
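As a rough sketch of that rule (not the catalog query), the decision boils down to one date comparison. The function and parameter names are hypothetical, and treating an incarnation with no backups at all as "old" is my own assumption.

```python
from datetime import datetime, timedelta

FALLBACK_DAYS = 365  # the one-year assumption used when no recovery window is set

def is_old(last_backup, now, window_days=None):
    """True when the most recent backup is older than the recovery window.

    last_backup: datetime of the latest backup, or None if there is none.
    window_days: recovery window in days, or None for redundancy policies.
    """
    days = window_days if window_days is not None else FALLBACK_DAYS
    if last_backup is None:
        # Assumption: nothing was ever backed up, so nothing is in the window.
        return True
    return last_backup < now - timedelta(days=days)

now = datetime(2024, 1, 1)
print(is_old(datetime(2023, 12, 25), now, window_days=14))  # backup inside the window
print(is_old(datetime(2022, 6, 1), now))                    # redundancy policy, fallback applies
```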

Getting the last backup of each database

Sadly, there is not a single table where you can verify that. You have to collect the information from several tables. I think bdf, al, cdf, bs would suffice in most cases.

When you delete an incarnation you specify a db_key: you have to get the last backup for each db_key, with queries like this:

Putting together all the tables:

Getting the recovery window

The configuration information for each database is stored inside the conf table, but the retention information is stored in a VARCHAR2, either ‘TO RECOVERY WINDOW OF % DAYS’ or ‘TO REDUNDANCY %’

You need to convert it to a number when the retention policy is a recovery window; otherwise, when redundancy is used, you default it to 365 days. You can add a column and a join to the query:
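The same conversion, sketched outside the database in Python: extract the number of days from a recovery-window string and fall back to 365 for anything else. The function name is hypothetical; the two string formats are the ones described above.

```python
import re

DEFAULT_DAYS = 365  # fallback used for redundancy-based policies

def retention_days(value, default=DEFAULT_DAYS):
    """Turn a retention policy string into a number of days."""
    match = re.search(r"RECOVERY\s+WINDOW\s+OF\s+(\d+)\s+DAYS",
                      value or "", re.IGNORECASE)
    return int(match.group(1)) if match else default

print(retention_days("TO RECOVERY WINDOW OF 14 DAYS"))
print(retention_days("TO REDUNDANCY 1"))
```

A missing configuration row (passed here as None) also falls back to the default, which matches the treatment of databases that still use the default policy.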

And finally, either display whether the incarnation is still in use or filter by usage:

Delete the incarnations!

You can delete the incarnations with this procedure:

This procedure will raise an exception (-20001, ‘Database not found’) when a database does not exist anymore (either already deleted by this procedure or by another session), so you need to handle it.
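How you handle it depends on where the loop runs. As a minimal sketch, assuming the loop is driven from Python, the pattern is "swallow error 20001, re-raise anything else". `CatalogError` and the `unregister` callable are stand-ins I made up so the example is self-contained; they are not a real driver API.

```python
class CatalogError(Exception):
    """Stand-in for a database error that carries an Oracle-style code."""
    def __init__(self, code, message):
        super().__init__(f"ORA-{code}: {message}")
        self.code = code

DATABASE_NOT_FOUND = 20001

def purge(candidates, unregister):
    """Call unregister(db_key, db_id) for every candidate.

    Returns (removed, skipped): skipped holds the db_keys that were
    already gone. Any other error is propagated to the caller.
    """
    removed, skipped = [], []
    for db_key, db_id in candidates:
        try:
            unregister(db_key, db_id)
            removed.append(db_key)
        except CatalogError as exc:
            if exc.code != DATABASE_NOT_FOUND:
                raise
            skipped.append(db_key)
    return removed, skipped

# Hypothetical stand-in that pretends db_key 2 was already deleted.
def fake_unregister(db_key, db_id):
    if db_key == 2:
        raise CatalogError(DATABASE_NOT_FOUND, "Database not found")

print(purge([(1, 111), (2, 222), (3, 333)], fake_unregister))
```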

Putting it all together:

I have used this procedure today for the first time and it worked like a charm.

However, if you have any adjustments or suggestions, don’t hesitate to leave a comment 🙂

HTH

Getting the DBID and Incarnation from the RMAN Catalog

Using the RMAN catalog is an option. There is a long-running discussion among DBAs on whether you should use the catalog or not.

But because I like the RMAN catalog (a lot) and I generally use it, I assume that most of you do too 😉

When you want to restore from the RMAN catalog, you need to get the DBID of the database you want to restore and, sometimes, also the incarnation key.

The DBID is used to identify the database you want to restore. The DBID is different for every newly created / duplicated database, but beware that if you duplicate your database manually (using restore/recover), you actually need to change your DBID using the nid tool, otherwise you will end up with more than one database registered in the catalog with the very same DBID. This is evil! The DB_NAME is also something that you may want to make sure is unique within your database farm.

The Incarnation Key changes whenever you do an “open resetlogs”, following for example a flashback database, an incomplete recovery, or just an “open resetlogs” without any specific need.

[Image: database timeline showing the incarnation before the open resetlogs (red) and the one after it (blue)]

In the image, you can see that you may want to restore to a point in time after the open resetlogs (blue incarnation) or before it (red incarnation). Depending on which one you need to restore, you may need to use the command RESET DATABASE TO INCARNATION.

https://docs.oracle.com/database/121/RCMRF/rcmsynta2007.htm#RCMRF148

If you have a big, dynamic environment, you probably script your restore procedures. In that case, getting the DBID and incarnation key using RMAN commands may be more complex than just querying the catalog using sqlplus.

How do I get the history of my database incarnations?

You can get it easily for all your databases using the handy hierarchical queries on the RMAN catalog (db names and ids are obfuscated for obvious reasons):

What about getting the correct DBID/DBINC_KEY pair for a specific database/time?

You can get the time windows for each incarnation using the lead() analytical function:

With this query, you can see that every incarnation has a reset time and a “next reset time”.

It’s easy then to get exactly what you need by adding a couple of where clauses:

So, if I need to restore the database 1465419F until time 2016-01-20 00:00:00, I need to set DBID=1048383773 and reset the database to incarnation 1256014297.
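For readers who want to see the logic without the catalog at hand, here is a small Python sketch of the same idea: sort the incarnations of one database by reset time, pair each with the next reset time (what lead() does), then pick the one whose window contains the target time. The names, the sample keys and the boundary rule (reset time inclusive, next reset time exclusive) are my assumptions.

```python
from datetime import datetime

def incarnation_windows(incarnations):
    """incarnations: list of (dbinc_key, reset_time) for ONE database.

    Returns (dbinc_key, reset_time, next_reset_time) tuples ordered by
    reset_time; the most recent incarnation gets next_reset_time=None.
    """
    ordered = sorted(incarnations, key=lambda inc: inc[1])
    following = [inc[1] for inc in ordered[1:]] + [None]
    return [(key, reset, nxt) for (key, reset), nxt in zip(ordered, following)]

def incarnation_at(incarnations, when):
    """Key of the incarnation whose time window contains `when`, else None."""
    for key, reset, nxt in incarnation_windows(incarnations):
        if reset <= when and (nxt is None or when < nxt):
            return key
    return None

# Hypothetical incarnation keys and reset times.
incs = [(20, datetime(2016, 1, 1)), (10, datetime(2015, 1, 1)), (30, datetime(2016, 2, 1))]
print(incarnation_at(incs, datetime(2016, 1, 20)))
```

Note that ordering purely by reset time ignores the parent/child hierarchy, so orphaned incarnations deserve a manual check before trusting the result.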

Cheers

Ludo

Oracle Database 12c: RMAN recover at table level

Oracle Database 12c comes with a new feature named “RMAN table level recovery”.

After a quick try, it’s easy to understand that we are talking about Tablespace Point-in-Time Recovery (TSPITR) with some automation to make it nearly transparent.

 

How to launch it

The syntax is quite trivial. Suppose you’ve dropped a table ludovico.reco and then purged it (damn!): you can’t flash it back to before the drop, and you don’t want to flash back the entire database.

 

You can recover the table with:

 

You identify the schema.table:partition to restore; optionally you can pass the pluggable database containing the table to recover, then the time definition as usual (SCN, sequence number or timestamp) and an auxiliary destination.

This Auxiliary destination is well-known to be mandatory for TSPITR. You can pass other options like table renaming or tablespace remapping.

Of course, the database must be open read-write and in archivelog mode, and at least one successful backup must have been taken.

How it works

Oracle prepares an auxiliary instance by restoring the SYSTEM, UNDO and SYSAUX tablespaces.

Then it opens the partial database in READ-ONLY mode.

 

It then uses the read-only dictionary to identify the tablespace that contained the table before the data loss. This tablespace (users in my example) is restored and recovered, and the database is opened.

 

At this point, RMAN starts an export/import with Data Pump to move the table from the auxiliary database back to the target database:

 

Finally, the auxiliary instance is cleaned:

 

We can check if our table is ok:

 

Oh, and yes, now we can select directly from RMAN! 🙂

 

My opinion

  • It still needs the space required to recover the auxiliary instance (system, sysaux, temp and the user tablespace containing the missing data), so it has all the drawbacks of the typical TSPITR, but it’s automatic, so it is an improvement in real life.
  • Restoring the user tablespace separately from the system tablespaces can be an issue if you’re saving backupsets to tape: you can end up reading the same backupset twice when it could have been read once.

Cheers

Ludovico

Script that duplicates a database using a physical standby RAC as source

It’s possible to duplicate a database for testing purposes (for example) using a standby database as the source. This allows you to off-load the production environment.

This is a simple script that makes use of ASM and the classic duplicate, although I guess it’s possible to use the standby DB for a duplicate from active database.
You can launch it every day to align your test env to a point in time.

Dog eat Dog… Oracle deletes itself by mistake!

While implementing the backup on a new DB inherited from a customer, I scheduled our standard backup “type disk” procedure through rman, on Windows.
The morning after I saw that the “delete obsolete” tried to delete ALL CURRENT DATAFILES!!

RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
allocated channel: ORA_DISK_1
channel ORA_DISK_1: sid=29 devtype=DISK
Deleting the following obsolete backups and copies:
Type Key Completion Time Filename/Handle
-------------------- ------ ------------------ --------------------
Backup Set 917 28-JUN-11
...
Backup Set 927 29-JUN-11
Backup Piece 1005 29-JUN-11 H:\ORACLE\BACKUP\ORAPERSP\RMAN\SPFILEBCK_20110629
Datafile Copy 14 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\INDX01.DBF
Datafile Copy 16 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\TOOLS01.DBF
Datafile Copy 17 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\USERS01.DBF
Datafile Copy 18 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\DRSYS01.DBF
Datafile Copy 19 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\EXAMPLE01.DBF
Datafile Copy 20 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\ODM01.DBF
Datafile Copy 21 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\XDB01.DBF
Datafile Copy 22 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\CWMLITE01.DBF
Datafile Copy 23 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\TBLDATI01.ORA
Datafile Copy 24 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\TBLINDEX01.ORA
Datafile Copy 25 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\OEM_REPOSITORY1.ORA
Datafile Copy 26 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\SYSTEM01.DBF
Datafile Copy 27 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\UNDOTBS01.DBF
deleted backup piece
...
deleted backup piece
backup piece handle=H:\ORACLE\BACKUP\ORAPERSP\RMAN\C-2220366420-20110628-02 recid=990 stamp=755031582
deleted backup piece
backup piece handle=H:\ORACLE\BACKUP\ORAPERSP\RMAN\C-2220366420-20110629-00 recid=1002 stamp=755130872
deleted backup piece
backup piece handle=H:\ORACLE\BACKUP\ORAPERSP\RMAN\CTL_20110629 recid=1004 stamp=755130883
deleted backup piece
backup piece handle=H:\ORACLE\BACKUP\ORAPERSP\RMAN\SPFILEBCK_20110629 recid=1005 stamp=755130885
RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03009: failure of delete command on ORA_DISK_1 channel at 06/29/2011 22:34:55
ORA-19584: file E:\ORACLE\ORADATA\ORAPERSP\INDX01.DBF already in use
Recovery Manager complete.

That’s because all the current datafiles were registered in the recovery catalog as datafile copies. With a retention policy of redundancy 1, all of them were flagged as obsolete! But since it’s Windows, a delete command doesn’t delete datafiles that are in use. What if it had been on Unix? We were just lucky!

Then we had to uncatalog all copies.


RMAN> list copy;

specification does not match any archived log in the recovery catalog

List of Datafile Copies
Key File S Completion Time Ckp SCN Ckp Time Name
------- ---- - -------------------- ---------- -------------------- ----
26 1 X 29-NOV-10 18535127593 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\SYSTEM01.DBF
27 2 X 29-NOV-10 18535127762 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\UNDOTBS01.DBF
14 3 X 29-NOV-10 18535122625 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\INDX01.DBF
16 4 X 29-NOV-10 18535123721 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\TOOLS01.DBF
17 5 X 29-NOV-10 18535124423 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\USERS01.DBF
18 6 X 29-NOV-10 18535124439 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\DRSYS01.DBF
19 7 X 29-NOV-10 18535124453 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\EXAMPLE01.DBF
20 8 X 29-NOV-10 18535124554 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\ODM01.DBF
21 9 X 29-NOV-10 18535125790 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\XDB01.DBF
22 10 X 29-NOV-10 18535125874 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\CWMLITE01.DBF
23 11 X 29-NOV-10 18535125887 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\TBLDATI01.ORA
24 12 X 29-NOV-10 18535126750 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\TBLINDEX01.ORA
25 13 X 29-NOV-10 18535127211 29-NOV-10 E:\ORACLE\ORADATA\ORAPERSP\OEM_REPOSITORY1.ORA


RMAN> change copy of datafile 1..N uncatalog;

uncataloged datafile copy
datafile copy filename=E:\ORACLE\ORADATA\ORAPERSP\INDX01.DBF recid=14 stamp=736336991
Uncataloged 1 objects
...

until no current datafiles were reported as “obsolete”!


RMAN> report obsolete;

RMAN retention policy will be applied to the command
RMAN retention policy is set to redundancy 1
no obsolete backups found

Lesson learned: never schedule delete obsolete without actually checking what could be deleted!
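One way to automate that check, sketched in Python under a few assumptions: you have already collected the file names listed by REPORT OBSOLETE and the current datafile names (for example from V$DATAFILE) into two lists. The function name is hypothetical, and the comparison is case-insensitive because the paths here are Windows paths.

```python
def live_datafiles_marked_obsolete(obsolete_paths, datafile_paths):
    """Return the obsolete entries that are actually current datafiles."""
    # Case-insensitive match: an assumption that fits Windows paths.
    live = {path.strip().lower() for path in datafile_paths}
    return [path for path in obsolete_paths if path.strip().lower() in live]

obsolete = [
    r"H:\ORACLE\BACKUP\ORAPERSP\RMAN\SPFILEBCK_20110629",
    r"E:\ORACLE\ORADATA\ORAPERSP\INDX01.DBF",
]
current = [
    r"e:\oracle\oradata\orapersp\indx01.dbf",
    r"E:\ORACLE\ORADATA\ORAPERSP\SYSTEM01.DBF",
]

dangerous = live_datafiles_marked_obsolete(obsolete, current)
if dangerous:
    print("Do NOT run delete obsolete, live datafiles are listed:", dangerous)
```

If the returned list is not empty, the scheduled job should stop before the delete step and alert someone.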