How cold incremental recovery saved me once

UPDATE: In the original version I was missing a few keywords: “incremental level 0” for the base backup and “resetlogs” at the database open. Thanks Gregorz for your comments.

Sorry for this “memories” post, but the technical solution at the end is worth the read, I hope 😉

Back in 2010, I was in charge of a quite complex project and faced some difficulties that led me to recover a database in a different manner. A few years have passed, but I used again the same procedure many times with full satisfaction… I think it’s worth to publish it now.

But first, let me introduce the project details and the problem.

 

Scope of the project

Transport a >1TB RAC database from AIX 5 on P6 to AIX 6 on P7, from a third-party datacenter in southern Italy to our main datacenter in northern Italy.
The Database featured >1000 datafiles and a huge table (800GB) partitioned by range and sub-partitioned by list (or the opposite, can’t remember).

 

Challenges

For budget containement, the project owner asked to avoid the use of HACMP (and thus, avoid the use of shared JFS2). I decided  then to take the risk and migrate from JFS2  to ASM.

In order to avoid a few platform-related ASM bugs, I also had to upgrade from Oracle 10.2.0.3 to Oracle 10.2.0.4.

 

Constraints

I had no access to the source database that was 800km far from our datacenter, and I was granted only to ask for RMAN backups.

The total time of service disruption accepted was quite short (<30 minutes) considering the size and the distance of the database, and there was no direct connectivity between the sites (for political reasons).

Globally, the network throughput for sharing files over ftp was very poor.

 

First solution

This kind of move was very common to me, and because I was not grated to ask for a temporary Data Guard configuration, the easy solution for me was to ask:

1 – one RMAN ONLINE full backup physically sent on disk

2 – many RMAN archive backups sent over network (via ftp)

Then, on my side, restore the full backup, recover the archives sent over time and, at the date X, ask a final archive backup, ask to close the db and send the online redo logs to do a complete recovery on my side, then startup open upgrade.

 

Problem

I did a first “dry run” open resetlogs in order to test the procedure and make it faster, and also asked to test the application pointing to the destination database.

The very bad surprise was that the source database was doing a huge amount of nologging inserts leading to monster index corruptions after the recovery on the destination database.

According to the current database maintainer, setting the force logging on the source database was NOT an option because the SAN was not able to cope with the high redo rates.

 

Solution

By knowing the Oracle recovery mechanisms, I have proposed to the remote maintainer to change the recovery strategy, despite this solution was not clearly stated in the Oracle documentation:

1 – Take a first online incremental backup from the begin scn of  the base full backup (thank God block change tracking was in place) and send it physically over disk

2 – Take other smaller online  incremental backups, send them over ftp  and apply them on the destination with “noredo”

3 – At the date X, shutdown the source, mount it and take a last incremental in mount state

4 – recover noredo the last incremental and open resetlogs the database.

According to the documentation, the “cold incremental strategy” applies if you take “cold full backups”. But from a technical point of view, taking a cold incremental and recovering it on top of a fuzzy online backup this is 100% equivalent of taking a full consistent backup in mount state.
Because all the blocks are consistent to a specific SCN, there are no fuzzy datafiles:  they are recovered from incremental taken from a mounted database! This allows to do incremental recovery and open the databases without applying any single archived log and by shutting down the database only once.

 

Technical steps

First, take a  full ONLINE backup on the source:

Then restore it on the destination (with no recovery):

Then, run a COLD incremental backup on the source:

And run the incremental recovery on the source (without redo):

That’s all!

This solution gave me the opportunity to move physically the whole >1TB nologging database from one region to another one with a minimal service disruption and without touching at all the source database.

I used it many times later on, even for bigger databases and on several platforms (yes, also Windows, sigh), it works like a charm.

HTH

Ludovico

The following two tabs change content below.

Ludovico

Principal Product Manager at Oracle
Ludovico is a member of the Oracle Database High Availability (HA), Scalability & Maximum Availability Architecture (MAA) Product Management team in Oracle. He focuses on Oracle Data Guard, Flashback technologies, and Cloud MAA.

9 thoughts on “How cold incremental recovery saved me once

  1. Hi Ludovico,

    thanks sharing your experience with others. Have you used to migrate from Windows to Linux same endian ?
    Thanks
    Alexander

  2. Hi Ludovico,

    thanks a lot for your response.
    I’ll do a test on a technical DB and validate the procedure in our env.

    Cheers,
    Gaetano

  3. Hi Ludovico,
    thanks for sharing your stuffs.
    I’am in a same situation : we need to move RAC DB from SITE A (France) to SITE B (ITALY) and I’am using RMAN fullbackup on migration link (LUN on dark fiber from France to Italy) and then restore it in Italy (because we need to change the redundancy from NORMAL to EXTERNAL -> VPLEX) and then recreate services, etc ….
    I’am looking to apply your method to minimize down time because I’ve to move more then 100 big DBs.
    What do you think ?
    Thanks in advance for your remarks.
    BR,
    Gaetano

    • Hi Gaetano,
      if you do not have non-logged transactions and if you can “touch” the source database, the easiest way is to set up a Data Guard and perform a switchover (provided that you have Enterprise Edition). This would provide the minimal downtime.

      Otherwise, you can definitely use the method I have described in the post. As I said, I used it a lot and it works really well.
      Cheers

      Ludovico

    • Thanks Gregorz, I wanted to see if you were paying attention 😀

      indeed, for the backup, either:
      – take an online backup level 0 + cold incremental level 1
      or
      – take an online full backup + cold incremental from scn

      regarding the redo logs, if you copy the online logs from the source you can open without resetlogs, otherwise resetlogs is needed.
      I’ll fix it in the post.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.