Ludovico is a member of the Oracle Database High Availability (HA), Scalability & Maximum Availability Architecture (MAA) Product Management team at Oracle.
He focuses on Oracle Data Guard, Flashback technologies, and Cloud MAA.
Well, this is a completely different post from what I usually publish. I like to blog about technology, personal interests and achievements.
This time I would really like to spend a few words praising a friend.
I met Franck Pachot for the first time back in 2012. It was my first month at Trivadis and, believe it or not, Franck was working there as well. I have the evidence here 😉
It was the first time in years that I had met someone at least as smart as me on the Oracle stack (later on, I met smarter people many more times, but that’s another story).
A few months later, he left Trivadis to join its sworn enemy, dbi services. But established friendships and like-mindedness don’t disappear: we continued to meet whenever an opportunity came up, and we started almost simultaneously to boost our blogging activities, give public presentations and expand our presence on social media (mostly Twitter).
After I got my Oracle ACE status in 2014, we went together to Oracle Open World. I already knew many folks there, and I can say that I helped Franck meet many smart people inside and outside the ACE Program. A month after OOW, he became an Oracle ACE.
Franck’s energy, passion and devotion to the Oracle Community are endless. What he’s doing, including his latest big effort, is just great, and all the people in the Oracle Community respect him. I can say that he is now far more active than me in the Oracle Community (at least regarding “public” activities ;-))
We both had the goal of becoming Oracle ACE Directors, and I spent a nerve-racking month in April when I became an ACE Director while his nomination was still pending.
I said: “If you become ACE Director by the end of April I will write a blog post about you.” And that’s where this post comes from.
Congratulations ACE Director Franck, perfect timing! 🙂
Pipes, temporary files, lock files, processes spawned in the background, rows inserted in a status table that need to be updated… Everything needs to be cleaned up when the script exits, even when the exit condition is not triggered inside the script.
BAD:
The worst practice is, of course, forgetting to clean up the temp files altogether, leaving my output and temporary directories full of *.tmp, *.pipe, *.lck files, etc. I will not show the code because the list of bad practices is quite long…
Better than forgetting to clean up, but still very bad, is to clean up everything just before triggering the exit command (in the following example, F_check_exit is a function that exits the script if its first argument is non-zero, as defined in the previous episode):
Shell
...
some_command_that_must_succeed
EXITCODE=$?
if [ $EXITCODE -ne 0 ]; then
  # Need to exit here, but the F_check_exit function does not clean up correctly
  [[ $TEMPFILE ]] && [[ -f $TEMPFILE ]] && rm $TEMPFILE
  [[ $EXP_PIPE ]] && [[ -p $EXP_PIPE ]] && rm $EXP_PIPE
  if [ $CHILD_PID ]; then
    ps --pid $CHILD_PID > /dev/null
    if [ $? -eq 0 ]; then
      kill $CHILD_PID  # or wait, or what?
    fi
  fi
  F_check_exit $EXITCODE "Some command that must succeed"
fi
A better approach would be to put all the cleanup tasks in a Cleanup() function and call that function instead of duplicating all the code everywhere:
Shell
...
some_command_that_must_succeed
EXITCODE=$?
[[ $EXITCODE -eq 0 ]] || Cleanup
F_check_exit $EXITCODE "Some command that must succeed"
But I still need to make sure that I insert this piece of code everywhere. Not optimal yet.
I could include the Cleanup function inside the F_check_exit function, but then I would have two inconveniences:
1 – I need to define the Cleanup function in every script that includes my include file
2 – there will still be exit conditions that are not trapped
GOOD:
The good approach is to trap EXIT (a bash pseudo-signal that fires on any script termination) with the Cleanup function:
Shell
Cleanup () {
  # cleanup your stuff here
  :
}
trap Cleanup EXIT

do_something
F_check_exit $? "Something"
Much better! But what if my include script has some logic that also creates temporary files?
I can create a global F_cleanup function that executes the local Cleanup function, if defined. Let me show this:
Include script:
Shell
# this is the include file (e.g. $BASEBIN/Init_Env.sh)
function F_cleanup () {
  EXITCODE=$?
  if [ `typeset -F Cleanup` ]; then
    edebug "Cleanup function defined. Executing it..."
    Cleanup $EXITCODE
    edebug "Cleanup function executed with return code $?"
  else
    edebug "No cleanup function defined."
  fi
  # do other global cleanups
}

### Register the cleanup function
trap F_cleanup EXIT
Main script:
Shell
# Cleanup: if a function named Cleanup is defined, it will automatically be executed
# upon the EXIT signal.
Cleanup () {
  if [ $1 -eq 0 ]; then
    : # exit 0 trapped
  else
    : # exit !0 trapped
    # report the error
  fi
  # remove pipes, temporary files etc.
}

. $BASEBIN/Init_Env.sh

do_something
F_check_exit $? "Something"
The Cleanup function will be executed only if it is defined.
No Cleanup function? No worries: the F_cleanup function can still do some global cleanup that is not specific to the main script.
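To make this concrete, here is a minimal sketch of what a Cleanup function in the main script could contain; the variables ($TEMPFILE, $EXP_PIPE, $CHILD_PID) are the same hypothetical ones used in the first example above:
Shell
Cleanup () {
  typeset -i RC=${1:-0}
  # remove temporary files and pipes, if they were created
  [[ -n "$TEMPFILE" && -f "$TEMPFILE" ]] && rm -f "$TEMPFILE"
  [[ -n "$EXP_PIPE" && -p "$EXP_PIPE" ]] && rm -f "$EXP_PIPE"
  # stop the background child, if it is still running
  if [[ -n "$CHILD_PID" ]] && kill -0 "$CHILD_PID" 2>/dev/null; then
    kill "$CHILD_PID"
  fi
  # report abnormal terminations
  if [[ $RC -ne 0 ]]; then
    echo "Script terminated with exit code $RC" >&2
  fi
}
Because F_cleanup passes the trapped exit code as the first argument, the function can also decide whether to report an error, exactly as outlined in the skeleton above.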
Every command in a script may fail due to external reasons. Bash programming is not functional programming! 🙂
After running a command, make sure you check its exit code and either raise a warning or exit with an error, depending on how the failure impacts the execution of the script.
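For reference, the F_check_exit helper used throughout these snippets could look something like the following minimal sketch; the real function comes from the previous episode and is not reproduced here, so treat the details as assumptions:
Shell
# Sketch: exit (and fire the EXIT trap) when the previous command failed
F_check_exit () {
  typeset -i RC=$1
  typeset MSG="$2"
  if [ $RC -ne 0 ]; then
    echo "ERROR: ${MSG} failed with exit code ${RC}" >&2
    exit $RC
  fi
}
Called as F_check_exit $? "Some command", it either returns silently or exits with the original exit code, which in turn triggers the Cleanup logic registered with trap.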
BAD:
The worst example is not to check the exit code at all:
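A hypothetical illustration of the anti-pattern (commands and file names are made up): the script happily continues even if the copy fails, and the problem only surfaces later, far away from its real cause.
Shell
# No exit code check at all: if the copy fails, the next steps work on a missing file
cp /staging/export.dmp $WORKDIR/
import_something $WORKDIR/export.dmp
rm /staging/export.dmp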
My favorite solution is to automatically open a pipe that receives the standard output and redirects it to the logfile. With this solution, I can programmatically define my logfile name inside the script (based on the script name and input parameters, for example) and forget about redirecting the output every time I run a command.
Shell
export LOGDIR=/path/to/logfiles
export DATE=`date +"%Y%m%d"`
export DATETIME=`date +"%Y%m%d_%H%M%S"`
ScriptName=`basename $0`
Job=`basename $0 .sh`"_whatever_I_want"
JobClass=`basename $0 .sh`
function Log_Open () {
  if [ $NO_JOB_LOGGING ]; then
    einfo "Not logging to a logfile because -Z option specified." #(*)
  ...
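The core of the pipe-based redirection can be sketched as follows; the logfile naming and the use of mkfifo plus tee are assumptions about how such a Log_Open function may continue, not necessarily the original implementation:
Shell
# Sketch: send stdout and stderr of the whole script to a logfile through a named pipe
LOGFILE=${LOGDIR}/${Job}_${DATETIME}.log
LOGPIPE=${LOGDIR}/${Job}_${DATETIME}.pipe
mkfifo $LOGPIPE
# tee reads from the pipe in the background and writes to both the terminal and the logfile
tee -a $LOGFILE < $LOGPIPE &
TEE_PID=$!
# from now on, everything written to stdout/stderr goes through the pipe into the logfile
exec > $LOGPIPE 2>&1
The pipe and the background tee are exactly the kind of resources that the Cleanup/trap mechanism from the previous episode should remove on exit.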
Support different logging levels natively in your scripts so that your code will be more stable and maintainable.
BAD:
Shell
#!/bin/bash -l
...
# for debug only, comment out when OK
echo $a
do_something $a
# echo $? # sometimes does not work?
GOOD:
Nothing to invent here: there are already a few blog posts around about best practices for log messages. I personally like the one from Michael Wayne Goodman.
If you couple the verbosity level with input parameters you can have something quite clever (e.g. -s for silent, -V for verbose, -G for debug). I’m putting everything into one single snippet just as an example but, as you can imagine, you should really put all the fixed variables and functions inside an external file that you systematically include in your scripts:
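As an illustration of what such a snippet could contain (einfo and edebug match the function names used earlier in this series, while the VERBOSITY variable, the colour codes and the other names are assumptions of this sketch):
Shell
# Verbosity: 0=silent (-s), 1=default, 2=verbose (-V), 3=debug (-G)
VERBOSITY=${VERBOSITY:-1}

ecolor () {  # print a message wrapped in a colour escape sequence
  echo -e "${1}${2}\e[0m"
}
eerror ()   { [ $VERBOSITY -ge 1 ] && ecolor "\e[31m" "ERROR: $*" >&2 ; }
ewarning () { [ $VERBOSITY -ge 1 ] && ecolor "\e[33m" "WARN:  $*" ; }
einfo ()    { [ $VERBOSITY -ge 1 ] && ecolor "\e[32m" "INFO:  $*" ; }
everbose () { [ $VERBOSITY -ge 2 ] && ecolor "\e[36m" "INFO:  $*" ; }
edebug ()   { [ $VERBOSITY -ge 3 ] && ecolor "\e[35m" "DEBUG: $*" ; }
A -s option would then just set VERBOSITY=0, -V would set it to 2 and -G to 3, without touching any of the logging calls spread across the script.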
The echo builtin requires -e in order to make the colours work. When reading files, cat works, but less requires -r. vi may work with some hacking, but it’s not worth spending too much time on it, IMHO.
The main technical account (oracle here) usually has the smart environment, with aliases, scripts available at your fingertips, correct environment variables and functions.
When working with personal accounts, it may be boring to set up the environment at each login, copy it from a golden copy or reinvent the wheel every time.
Make your common environment as smart as possible. If any command needs to be run differently depending on the user (oracle or not oracle), just use a simple if:
Shell
if [ "$USER" != "oracle" ]; then
  alias vioratab='sudoedit -u oracle $ORATAB'
else
  alias vioratab='vi $ORATAB'
fi
The goal, of course, is to avoid as many typos as you can and let all your colleagues benefit from the smart environment.
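One way to achieve that, sketched here under the assumption that the shared environment file lives in a common path readable by everybody (the path and file name are hypothetical):
Shell
# In each personal account's ~/.bash_profile: source the shared team environment
if [ -r /opt/dba/common_env.sh ]; then
  . /opt/dba/common_env.sh
fi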
This is the first episode of a mini series of Bash tips for Linux (in case you are wondering, yes, they are respectively my favorite shell and my favorite OS 😉 ).
Nowadays many companies require you to log in to Linux servers with a personal account (integrated with LDAP, Kerberos or whatever else) to comply with strict auditing rules.
I need to be sure that I have an environment where my modifications do not conflict with my colleagues’ environments.
BAD:
Shell
-bash-4.1$ id
uid=20928(ludo) gid=200(dba) groups=200(dba)
-bash-4.1$ ls -lia
total 8
8196 drwxrwxr-x   2 oracle dba  4096 Mar 15 15:14 .
   2 drwxrwxrwt. 14 root   root 4096 Mar 15 15:15 ..
-bash-4.1$ vi script.sh
...edit here...
-bash-4.1$ ls -l
total 4
-rw-r--r-- 1 ludo dba 8 Mar 15 15:15 script.sh
-bash-4.1$
The script has been created by me, but my colleagues may need to modify it! So I need to change the ownership:
If I really want to change the owner, I have to ask someone who has root privileges, or delete the file with my account and recreate it with the correct one (oracle or something else).
GOOD:
Set the setgid bit at the directory level.
Define an alias for my favorite editor that uses sudoedit instead. Both are sketched below:
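A quick sketch of both measures (the directory path is just an example; vioratab is the same alias shown in the environment post earlier):
Shell
# 1) setgid on the shared scripts directory: new files inherit the dba group...
chmod g+s /path/to/scripts
# ...and a group-friendly umask lets colleagues modify them
umask 0002

# 2) edit files owned by oracle through sudoedit instead of plain vi
alias vioratab='sudoedit -u oracle $ORATAB'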
UPDATE: In the original version I was missing a few keywords: “incremental level 0” for the base backup and “resetlogs” at the database open. Thanks Gregorz for your comments.
Sorry for this “memories” post, but the technical solution at the end is worth the read, I hope 😉
Back in 2010, I was in charge of a quite complex project and faced some difficulties that led me to recover a database in a different manner. A few years have passed, but I have since used the same procedure many times, with full satisfaction… I think it’s worth publishing it now.
But first, let me introduce the project details and the problem.
Scope of the project
Transport a >1TB RAC database from AIX 5 on P6 to AIX 6 on P7, from a third-party datacenter in southern Italy to our main datacenter in northern Italy.
The database featured >1000 datafiles and a huge table (800 GB) partitioned by range and sub-partitioned by list (or the opposite, I can’t remember).
Challenges
For budget containment, the project owner asked us to avoid the use of HACMP (and thus avoid shared JFS2). I decided to take the risk and migrate from JFS2 to ASM.
In order to avoid a few platform-related ASM bugs, I also had to upgrade from Oracle 10.2.0.3 to Oracle 10.2.0.4.
Constraints
I had no access to the source database, which was 800 km away from our datacenter, and I was only allowed to ask for RMAN backups.
The total accepted service disruption was quite short (<30 minutes) considering the size and the distance of the database, and there was no direct connectivity between the sites (for political reasons).
Overall, the network throughput for sharing files over FTP was very poor.
First solution
This kind of move was very common for me and, because I was not allowed to ask for a temporary Data Guard configuration, the easy solution was to ask for:
1 – one RMAN ONLINE full backup physically shipped on disk
2 – many RMAN archivelog backups sent over the network (via FTP)
Then, on my side: restore the full backup, recover the archives as they arrived over time and, at date X, ask for a final archivelog backup, ask them to close the db and send the online redo logs so I could do a complete recovery, then startup open upgrade.
Problem
I did a first “dry run” open resetlogs in order to test the procedure and make it faster, and I also asked to test the application against the destination database.
The very bad surprise was that the source database was doing a huge amount of nologging inserts, leading to monster index corruptions after the recovery on the destination database.
According to the current database maintainer, enabling force logging on the source database was NOT an option because the SAN was not able to cope with the higher redo rates.
Solution
Knowing the Oracle recovery mechanisms, I proposed to the remote maintainer a change of recovery strategy, even though this solution was not clearly stated in the Oracle documentation:
1 – Take a first online incremental backup from the begin SCN of the base full backup (thank God block change tracking was in place) and send it physically on disk
2 – Take other, smaller online incremental backups, send them over FTP and apply them on the destination with “noredo”
3 – At date X, shut down the source, mount it and take a last incremental backup in mount state
4 – Recover the last incremental with “noredo” and open resetlogs the destination database (see the RMAN sketch below)
According to the documentation, the “cold incremental strategy” applies when you take “cold full backups”. But from a technical point of view, taking a cold incremental and recovering it on top of a fuzzy online backup is 100% equivalent to taking a full consistent backup in mount state.
Because all the blocks are consistent to a specific SCN, there are no fuzzy datafiles: they are recovered from an incremental taken from a mounted database! This allows doing an incremental recovery and opening the database without applying a single archived log, and by shutting down the source database only once.
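To make the strategy more concrete, here is a hedged sketch of the RMAN commands involved; the SCN and the paths are placeholders, and this is not the exact script used at the time:
# Source side: base backup, then incrementals starting from the checkpoint SCN of the base
BACKUP INCREMENTAL LEVEL 0 DATABASE FORMAT '/backup/base_%U';
BACKUP INCREMENTAL FROM SCN 1234567890 DATABASE FORMAT '/backup/roll_%U';

# Destination side: restore the base, catalog the shipped incrementals, apply them without redo
RESTORE DATABASE;
CATALOG START WITH '/backup/roll_';
RECOVER DATABASE NOREDO;
The last incremental, taken with the source database in mount state, is applied in exactly the same way, after which the destination can be opened with resetlogs (see below).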
SQL> -- open resetlogs can be avoided if I copy the online redo logs
SQL> alter database open resetlogs upgrade;

Database altered.

...
-- run catupgrd here
That’s all!
This solution gave me the opportunity to physically move the whole >1TB nologging database from one region to another with minimal service disruption and without touching the source database at all.
I have used it many times since, even for bigger databases and on several platforms (yes, also Windows, sigh), and it works like a charm.
Using the RMAN catalog is optional. There is a long-running discussion among DBAs on whether you should use the catalog or not.
But because I like the RMAN catalog (a lot) and I generally use it, I assume that most of you do too 😉
When you want to restore from the RMAN catalog, you need to get the DBID of the database you want to restore and, sometimes, also the incarnation key.
The DBID is used to identify the database you want to restore. The DBID is different for every newly created / duplicated database, but beware: if you duplicate your database manually (using restore/recover), you actually need to change the DBID with the nid tool, otherwise you will end up having more than one database registered in the catalog with the very same DBID. This is evil! The DB_NAME is also something that you may want to keep unique within your database farm.
The incarnation key changes whenever you do an “open resetlogs”, following, for example, a flashback database, an incomplete recovery, or just an “open resetlogs” without any specific need.
In the image, you can see that you may want to restore to a point in time after the open resetlogs (blue incarnation) or before it (red incarnation). Depending on which one you need to restore, you may need to use the command RESET DATABASE TO INCARNATION.
If you have a big, dynamic environment, you probably script your restore procedures; that’s why getting the DBID and incarnation key with RMAN commands may be more complex than just querying the catalog with sqlplus.
How do I get the history of my database incarnations?
You can get it easily for all your databases with a handy hierarchical query on the RMAN catalog (db names and ids are obfuscated for obvious reasons):
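A sketch of such a hierarchical query, based on the rc_database_incarnation view of the recovery catalog (select list trimmed; adapt it to your catalog version):
select lpad(' ', 2*(level-1)) || dbinc_key as inc_key,
       name, dbid, resetlogs_change#, resetlogs_time, status
  from rc_database_incarnation
 start with parent_dbinc_key is null
connect by prior dbinc_key = parent_dbinc_key
 order siblings by resetlogs_time;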
So, if I need to restore the database 1465419F until time 2016-01-20 00:00:00, I need to set DBID=1048383773 and reset the database to incarnation 1256014297.
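In RMAN that translates into something like the following sketch (connection to the catalog is assumed; the exact sequence depends on whether the controlfile must be restored as well):
SET DBID 1048383773;
STARTUP NOMOUNT;
RESET DATABASE TO INCARNATION 1256014297;
RESTORE CONTROLFILE;
ALTER DATABASE MOUNT;
RUN {
  SET UNTIL TIME "to_date('2016-01-20 00:00:00','yyyy-mm-dd hh24:mi:ss')";
  RESTORE DATABASE;
  RECOVER DATABASE;
}
ALTER DATABASE OPEN RESETLOGS;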
Cheers
—
Ludo