How cold incremental recovery saved me once

UPDATE: The original version was missing a few keywords: “incremental level 0” for the base backup and “resetlogs” at the database open. Thanks Grzegorz for your comments.

Sorry for this “memories” post, but the technical solution at the end is worth the read, I hope 😉

Back in 2010, I was in charge of a quite complex project and faced some difficulties that led me to recover a database in an unusual way. A few years have passed, but I have used the same procedure many times since, with full satisfaction… I think it’s worth publishing now.

But first, let me introduce the project details and the problem.

Scope of the project

Transport a >1TB RAC database from AIX 5 on P6 to AIX 6 on P7, from a third-party datacenter in southern Italy to our main datacenter in northern Italy.
The database had >1000 datafiles and a huge table (800GB) partitioned by range and sub-partitioned by list (or the other way around, I can’t remember).

Challenges

To contain the budget, the project owner asked to avoid HACMP (and thus shared JFS2). I then decided to take the risk and migrate from JFS2 to ASM.

In order to avoid a few platform-related ASM bugs, I also had to upgrade from Oracle 10.2.0.3 to Oracle 10.2.0.4.

Constraints

I had no access to the source database, which was 800km away from our datacenter, and I was only allowed to ask for RMAN backups.

The accepted service disruption was quite short (<30 minutes) considering the size and the distance of the database, and there was no direct connectivity between the sites (for political reasons).

Overall, the network throughput for sharing files over ftp was very poor.

First solution

This kind of move was very common for me, and because I was not allowed to ask for a temporary Data Guard configuration, the easy solution was to ask for:

1 – one RMAN ONLINE full backup physically sent on disk

2 – many RMAN archive backups sent over network (via ftp)

Then, on my side, I would restore the full backup and apply the archived logs as they arrived. On date X, I would ask for a final archive backup, ask them to close the database and send the online redo logs, do a complete recovery on my side, and finally open the database in upgrade mode.

Problem

I did a first “dry run” open resetlogs in order to test the procedure and make it faster, and also asked to test the application pointing to the destination database.

The very bad surprise was that the source database was doing a huge amount of nologging inserts leading to monster index corruptions after the recovery on the destination database.

ORA-26040: Data block was loaded using the NOLOGGING option

According to the current database maintainer, enabling force logging on the source database was NOT an option because the SAN was not able to cope with the high redo rates.
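A quick way to spot this kind of problem before the dry run is to look for datafiles that have been touched by unlogged operations. What follows is only a minimal sketch, not part of the original procedure: the function name `nologging_check_sql` is hypothetical, while `V$DATAFILE.UNRECOVERABLE_CHANGE#` and `UNRECOVERABLE_TIME` are standard columns. The script only prints the query, so it runs without a database at hand.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not from the original procedure):
# prints a SQL*Plus script listing the datafiles that were touched by
# NOLOGGING operations at some point in time.
nologging_check_sql() {
  cat <<'EOF'
set pagesize 100 linesize 200
select file#, unrecoverable_change#, unrecoverable_time
  from v$datafile
 where unrecoverable_change# > 0
 order by unrecoverable_time;
exit
EOF
}

# Only print the script; nothing is executed against a database here.
nologging_check_sql
```

On the source it could be piped to SQL*Plus, for example `nologging_check_sql | sqlplus -s "/ as sysdba"`. A recent UNRECOVERABLE_TIME, later than the base backup, would suggest that recovering with redo is going to produce ORA-26040 blocks.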

Solution

Knowing how Oracle recovery works, I proposed to the remote maintainer that we change the recovery strategy, even though this solution was not clearly described in the Oracle documentation:

1 – Take a first online incremental backup starting from the begin SCN of the base full backup (thank God block change tracking was in place) and ship it physically on disk

2 – Take other smaller online incremental backups, send them over ftp and apply them on the destination with “noredo”

3 – On date X, shut down the source, mount it and take a last incremental backup in mount state

4 – Apply the last incremental with “noredo” and open the database with resetlogs.
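The repeated part of this plan (steps 2 to 4) can be sketched as two small command generators. This is a hedged illustration under my own assumptions: the function names, the SCN value and the directories are made up for the example, and the script only prints RMAN commands instead of running them. `BACKUP INCREMENTAL FROM SCN`, `CATALOG START WITH … NOPROMPT` and `RECOVER DATABASE NOREDO` are regular RMAN commands.

```shell
#!/bin/sh
# Hypothetical helpers: they only print RMAN commands, nothing is executed.

# SOURCE: incremental backup of every block changed since a given SCN.
#   $1 = start SCN (assumption: the checkpoint SCN of the copy on the destination)
#   $2 = directory that receives the backup pieces
source_incr_cmds() {
  cat <<EOF
backup incremental from scn $1 database format '$2/incr_%U';
EOF
}

# DEST: catalog the shipped pieces and apply them without any redo.
#   $1 = directory where the pieces were copied
dest_apply_cmds() {
  cat <<EOF
catalog start with '$1/incr' noprompt;
recover database noredo;
EOF
}

# Example values only: the SCN and the paths are placeholders.
source_incr_cmds 1234567 /tmp/source
dest_apply_cmds /tmp/dest
```

The start SCN could be read on the destination with `select min(checkpoint_change#) from v$datafile_header;` (again an assumption on my side, not something taken from the original run). The same pair of commands would be repeated for each shipment; only the last one is taken with the source in mount state.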

According to the documentation, the “cold incremental strategy” applies when you take “cold full backups”. But from a technical point of view, taking a cold incremental backup and applying it on top of a fuzzy online backup is 100% equivalent to taking a full consistent backup in mount state.
Because all the blocks are consistent at a specific SCN, there are no fuzzy datafiles: they are recovered from an incremental backup taken from a mounted database! This makes it possible to do incremental recovery and open the database without applying a single archived log, and with only one shutdown of the source database.
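The “no fuzzy datafiles” claim can be checked on the destination after the last `recover database noredo` and before the open. A minimal sketch, assuming a mounted destination database; the function name `fuzzy_check_sql` is hypothetical, while `FUZZY`, `STATUS` and `CHECKPOINT_CHANGE#` are standard columns of `V$DATAFILE_HEADER`. As written, the script only prints the query.

```shell
#!/bin/sh
# Hypothetical helper: prints a SQL*Plus script that summarises the state of
# the datafile headers on the (mounted) destination database.
fuzzy_check_sql() {
  cat <<'EOF'
select fuzzy, status, checkpoint_change#, count(*) as files
  from v$datafile_header
 group by fuzzy, status, checkpoint_change#
 order by checkpoint_change#;
exit
EOF
}

# Only print the script; pipe it to sqlplus on the destination to run it.
fuzzy_check_sql
```

After a cold incremental has been applied I would expect FUZZY = NO everywhere, with the files sharing the checkpoint SCN of the source shutdown (read-only or offline files may legitimately differ).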

Technical steps

First, take a full ONLINE backup on the source:

-- SOURCE
SQL> alter database backup controlfile to '/tmp/source/ludo.cf' reuse;

Database altered.

SQL> exit
$ rman target /
RMAN> backup incremental level 0 database as compressed backupset format '/tmp/source/%U';

# SOURCE
scp -rp /tmp/source/ destsrv:/tmp/dest/
ludo.cf              100% |*************************************| 40944 KB    00:00
...

Then restore it on the destination (with no recovery):

# DEST
RMAN> restore controlfile from '/tmp/ludo.cf';

Starting restore at 11-AUG-15
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=1058 device type=DISK

channel ORA_DISK_1: copied control file copy
output file name=/.../control01.ctl
output file name=/.../control02.ctl
Finished restore at 11-AUG-15

RMAN> alter database mount;

Statement processed
released channel: ORA_DISK_1

RMAN> catalog start with '/tmp/dest/';
...
RMAN> run
2> {
3> set newname for database to '+DATA';
4>
5> restore database;
6> }
...
Finished restore at 11-AUG-15
RMAN>

Then, run a COLD incremental backup on the source:

-- SOURCE
SQL> shutdown immediate;
...
ORACLE instance shut down.

SQL> startup mount
ORACLE instance started.
...
Database mounted.
SQL> exit
$ rman target /
RMAN>  BACKUP AS COMPRESSED BACKUPSET INCREMENTAL LEVEL 1 
2> CUMULATIVE DATABASE format '/tmp/source/incr%U';
...
Finished backup at 11-AUG-15
RMAN> exit
$ scp -rp /tmp/source/incr* destsrv:/tmp/dest/

And run the incremental recovery on the destination (without redo):

# DEST
RMAN> catalog start with '/tmp/dest/incr';
...
RMAN> run {
2> recover database noredo;
3> }
...
channel ORA_DISK_1: starting incremental datafile backup set restore
...
Finished recover at 11-AUG-15
RMAN> exit
$ sqlplus / as sysdba
...
SQL> alter database disable block change tracking;
Database altered.
SQL> alter database flashback off;
Database altered.
SQL> alter database flashback on;
Database altered.
SQL> create restore point PREUPG guarantee flashback database;
Restore point created.
SQL> -- open resetlogs can be avoided if I copy the online redo logs
SQL> alter database open resetlogs upgrade;
Database altered.
...
-- run catupgrd here
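The guaranteed restore point created before the open is there as a safety net for the upgrade. A hedged sketch of how it could be used if catupgrd goes wrong: the function name is hypothetical and the sequence is my assumption of a typical fallback, not something taken from the original run. The script only prints the statements.

```shell
#!/bin/sh
# Hypothetical helper: prints the statements to go back to the state captured
# by a guaranteed restore point, then reopen the database in upgrade mode.
#   $1 = restore point name (PREUPG in the listing above)
rollback_upgrade_sql() {
  cat <<EOF
shutdown immediate
startup mount
flashback database to restore point $1;
alter database open resetlogs upgrade;
EOF
}

# Only print the statements; nothing is executed against a database.
rollback_upgrade_sql PREUPG
```

Once the upgrade is validated, the restore point should be dropped (`drop restore point PREUPG;`), otherwise the flashback logs keep growing in the recovery area.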

That’s all!

This solution let me physically move the whole >1TB nologging database from one region to another with minimal service disruption and without touching the source database at all.

I have used it many times since, even for bigger databases and on several platforms (yes, also Windows, sigh): it works like a charm.

HTH

—

Ludovico

Ludovico

Principal Product Manager at Oracle

Ludovico is a member of the Oracle Database High Availability (HA), Scalability & Maximum Availability Architecture (MAA) Product Management team in Oracle. He focuses on Oracle Data Guard, Flashback technologies, and Cloud MAA.

9 thoughts on “How cold incremental recovery saved me once”

Александър on January 9, 2023 at 12:44 pm said:

Hi Ludovico,

thanks for sharing your experience with others. Have you used it to migrate from Windows to Linux (same endianness)?
Thanks
Alexander

- Ludovico on March 22, 2023 at 6:11 pm said:
  
  Not really 🙂
  
Gaetano on August 10, 2016 at 4:34 pm said:

Hi Ludovico,

thanks a lot for your response.
I’ll do a test on a technical DB and validate the procedure in our env.

Cheers,
Gaetano

Gaetano on August 10, 2016 at 3:53 pm said:

Hi Ludovico,
thanks for sharing your stuff.
I’m in a similar situation: we need to move RAC DBs from SITE A (France) to SITE B (Italy). I’m taking an RMAN full backup over a migration link (LUN on dark fiber from France to Italy) and then restoring it in Italy (because we need to change the redundancy from NORMAL to EXTERNAL -> VPLEX), and then recreating services, etc.
I’m looking to apply your method to minimize downtime because I have to move more than 100 big DBs.
What do you think ?
Thanks in advance for your remarks.
BR,
Gaetano

- Ludovico on August 10, 2016 at 4:07 pm said:
  
  Hi Gaetano,
  if you do not have non-logged transactions and if you can “touch” the source database, the easiest way is to set up a Data Guard and perform a switchover (provided that you have Enterprise Edition). This would provide the minimal downtime.
  
  Otherwise, you can definitely use the method I have described in the post. As I said, I used it a lot and it works really well.
  Cheers
  —
  Ludovico
  
Grzegorz on March 17, 2016 at 6:53 pm said:

Thanks for the response, You rock man! 🙂

Grzegorz on March 16, 2016 at 8:06 pm said:

and one more thing, does not this first level 0 backup need level 0 specified ?
G

- Ludovico on March 17, 2016 at 9:34 am said:
  
  Thanks Grzegorz, I wanted to see if you were paying attention 😀
  
  indeed, for the backup, either:
  – take an online backup level 0 + cold incremental level 1
  or
  – take an online full backup + cold incremental from scn
  …
  regarding the redo logs, if you copy the online logs from the source you can open without resetlogs, otherwise resetlogs is needed.
  I’ll fix it in the post.
  
Grzegorz on March 16, 2016 at 8:04 pm said:

Cool. so You were able to open database upgrade without resetlogs option ?
Regards
G


DBA survival BLOG

DBA stuff and Oracle Data Guard
