Recording of “Rapid Home Provisioning” webinar for the RAC SIG

Yesterday I have presented the Oracle Rapid Home Provisioning technology for the RAC SIG, you can find the recording on YouTube:

Cheers

Ludo

Rapid Home Provisioning

In a few days I will give a presentation at UKOUG Tech15 about Rapid Home Provisioning, it will be the first time that I present this session in public.

I usually like to give the link to the material to my audience, so here we go:

Slides:

Demo:

Enjoy

Ludovico

Oracle Active Data Guard and Global Data Services in Action!

In a few days I will give a presentation at UKOUG Tech15 about Global Data Services, it will be the first time that I present this session.

I usually like to give the link to the material to my audience, so here we go:

Credits

I have to give special credits to my colleague Robert Bialek. I’ve got a late confirmation for this session and my slide deck was not ready at all, so I have used a big part of his original work. Most of the content included in the slides has been created by Robert, not me. (Thank you for your help! :-))

Slides

Demo recording

Demo script

And the script to revert the demo:

Cheers

Ludovico

 

Oracle Database on ACFS: a perfect marriage?

Update: I will give this presentation at UKOUG Tech15, Wed 9 December at 14:30.

This presentation has had a very poor score in selections for conferences (no OOW, no DOAG) but people liked it very much at Paris Oracle Meetup.  The Database on ACFS is mainstream now, thanks to the new ODA releases. Having some knowledge about why and how you should run (not) Databases on ACFS is definitely worth a read.

Slides

Demo 1 recording

Demo 2 recording

Demo script (DB ACFS clone from Standby Database)

 

Comments are, as always, very appreciated 🙂

Ludo

Migrating Oracle RAC from SuSE to OEL (or RHEL) live

I have a customer that needs to migrate its Oracle RAC cluster from SuSE to OEL.

I know, I know, there is a paper from Dell and Oracle named:

How Dell Migrated from SUSE Linux to Oracle Linux

That explains how Dell migrated its many RAC clusters from SuSE to OEL. The problem is that they used a different strategy:

– backup the configuration of the nodes
– then for each node, one at time
– stop the node
– reinstall the OS
– restore the configuration and the Oracle binaries
– relink
– restart

What I want to achieve instead is:
add one OEL node to the SuSE cluster as new node
– remove one SuSE node from the now-mixed cluster
– install/restore/relink the RDBMS software (RAC) on the new node
– move the RAC instances to the new node (taking care to NOT run more than the number of licensed nodes/CPUs at any time)
– repeat (for the remaining nodes)

because the customer will also migrate to new hardware.

In order to test this migration path, I’ve set up a SINGLE NODE cluster (if it works for one node, it will for two or more).

I have to setup the new node addition carefully, mainly as I would do with a traditional node addition:

  • Add new ip addresses (public, private, vip) to the DNS/hosts
  • Install the new OEL server
  • Keep the same user and groups (uid, gid, etc)
  • Verify the network connectivity and setup SSH equivalence
  • Check that the multicast connection is ok
  • Add the storage, configure persistent naming (udev) and verify that the disks (major, minor, names) are the very same
  • The network cards also must be the very same

Once the new host ready, the cluvfy stage -pre nodeadd will likely fail due to

  • Kernel release mismatch
  • Package mismatch

Here’s an example of output:

So the problem is not if the check succeed or not (it will not), but what fails.

Solving all the problems not related to the difference SuSE-OEL is crucial, because the addNode.sh will fail with the same errors.  I need to run it using -ignorePrereqs and -ignoreSysPrereqs switches. Let’s see how it works:

Then, as stated by the addNode.sh, I run the root.sh and I expect it to work:

Bingo! Let’s check if everything is up and running:

So yes, it works, but remember that it’s not a supported long-term configuration.

In my case I expect to migrate the whole cluster from SLES to OEL in one day.

NOTE: using OEL6 as new target is easy because the interface names do not change. The new OEL7 interface naming changes, if you need to migrate without cluster downtime you need to setup the new OEL7 nodes following this post: http://ask.xmodulo.com/change-network-interface-name-centos7.html

Otherwise, you need to configure a new interface name for the cluster with oifcfg.

HTH

Ludovico

Get information about Cursor Sharing for a SQL_ID

Yesterday I’ve got a weird problem with Adaptive Cursor Sharing. I’m not sure yet about the issue, but it seems to be related to cursor sharing histograms. Hopefully one day I will blog about what I’ve learnt from this experience.

To better monitor the problem on that specific query, I’ve prepared this script (tested on 12.1.0.2):

The result is something similar (in my case it has 26 child cursors):

It’s a quick way to get the relevant information in a single result.

Off course, if you need deeper details, you should consider something more powerful like SQLd360 from Mauro Pagano.

Credits: I’ve got the unpivot idea (and copied that part of the code) from this post by Timur Akhmadeev.

Ludo

Querying the dba_hist_sys_time_model to get historical data

This quick post is mainly for myself… I will certainly use it for reference in the future.

Debugging problems due to adaptive dynamic sampling and in general adaptive features sometimes needs to get historical data about, e.g., parse time.

In order to get this information you may need to query the view DBA_HIST_SYS_TIME_MODEL (take care, it needs Diagnostic Pack license!)

You can use this query as an example.

 

In this specific example, it shows the “parse time elapsed”, the “DB time” and the percentage parse/dbtime, along with the value of the parameter “optimizer_adaptive_features“. You can use it to check if changing the parameters related to adaptive dynamic sampling improves or not the parse time.

The output will be something like this:

HTH

Ludo

Grid Infrastructure 12c: Recovering the GRID Disk Group and recreating the GIMR

Losing the Disk Group that contains OCR and voting files has always been a challenge. It requires you to take regular backups of OCR, spfile and diskgroup metadata.

Since Oracle 12cR1, there are a few additional components you must take care of:

– The ASM password file (if you have Flex ASM it can be quite critical)

– The Grid Infrastructure Management Repository

Why ASM password file is important? Well, you can read this good blog post form my colleague Robert Bialek: http://blog.trivadis.com/b/robertbialek/archive/2014/10/26/are-you-using-oracle-12c-flex-asm-if-yes-do-you-have-asm-password-file-backup.aspx

So the problem here, is not whether you should back them up or not, but how you can restore them quickly.

Assumptions: you back up regularly:

ASM parameter  file:

Oracle Cluster Registry:

ASM Diskgroup Metadata:

ASM password file:

What about the GIMR?

According to the MOS Note: FAQ: 12c Grid Infrastructure Management Repository (GIMR) (Doc ID 1568402.1), there is no such need for the moment.

Weird, huh? The -MGMTDB itself contains for the moment just the Cluster Health Monitor repository, but expect to see its important increasing with the next versions of Oracle Grid Infrastructure.

If you REALLY want to back it up (even if not fundamental, it is not a bad idea, after all), you can do it.

The -MGMTDB is in noarchivelog by default. You need to either put it in archivelog mode (and set a recovery area, etc etc) or back it up while it is mounted.

Because the Cluster Health Monitor (ora.crf)  depends on it, you have to stop it beforehand:

Then you can operate with -MGMTDB:

Now, imagine that you loose the GRID diskgroup (nowadays, with the ASM Filter Driver, it’s more complex to corrupt a device by mistake, but let’s assume that you do it):

The cluster will not start anymore, you need to disable the crs, reboot and start it in exclusive mode:

 

Then you can recreate the GRID disk group and restore everything inside it:

Finally, the last missing component: the GIMR.

You can recreate it or restore it (if you backed it up at some point in time).

Let’s see how to recreate it:

Conclusion

Recovering from a lost Disk Group / Cluster is not rocket science. Just practice it every now and then. If you do not have a test RAC, you can build your lab on your laptop using the RAC Attack instructions. If you want to test all the scenarios, the RAC SIG webcast: Oracle 11g Clusterware failure scenarios with practical demonstrations by Kamran Agayev is the best starting point, IMHO. Just keep in mind that Flex ASM and the GIMR add more complexity.

HTH

Ludovico

Another successful RAC Attack in Geneva!

ninja-suisseLast week I have hosted the second Swiss RAC Attack workshop at Trivadis offices in Geneva. It has been a great success, with 21 total participants: 5 Ninjas, 4 alumni and 14 people actively installing or playing with RAC 12c on their laptops.

Last year I was suprised by a participant coming fron Nanterre. This year two people came directly from Moscow, just for the workshop!

We’ve got good pizza and special beer: Chimay , Vedett, Duvel, Andechs…

Last but not least, our friend Marc Fielding was visiting Switzerland last week, so he took the opportunity to join us and make the workshop even more interesting! 😀
DSC07173 DSC07164 DSC07183 DSC07148 DSC07147 DSC07144 DSC07142 DSC07154 DSC07153 DSC07152 DSC07151

Looking forward to organize it again in one year! Thank you guys 🙂

Ludovico

How to avoid ORA-02153 when creating database links on 11.2.0.4 (the unsupported way)

Disclaimer (wow, most of my recent posts start with a disclaimer, I guess it’s bad): this post explains an UNSUPPORTED workaround for an error enforced by recent Oracle security checks. You should never use it in production! Forewarned is forearmed.

Before Oracle Database 11.2.0.4, it was possible to create a database link using the following syntax:

It was possible to get the password hash by either selecting dbms_metadata.get_ddl for the database link or by querying directly the link$ table.

Starting with Oracle 11.2.0.4, Oracle is enforcing a check that prevents to use such syntax. Every newly created database link must have the password explicitly set.

This is clearly stated in the MOS note:

ORA-02153: Invalid VALUES Password String When Creating a Database Link Using BY VALUES With Obfuscated Password After Upgrade To 11.2.0.4 (Doc ID 1905221.1)

This is seen as a security enhancement. In my opinion, it forces also to specify clear text passwords somewhere in the scripts that create the db links. (You do not create the db links by hand in sql*plus every time you need one.  Do you?)

The only exception is when using the expdp/impdp. If you expdp a schema, the dumpfile contains the password hash and the statement needed to recreate the database link (… identified by values ‘:1’), but Oracle only allows impdp to use such statement.

So, simple workaround, just create the database links on a dev/staging environment, export them using expdp and then provide your dba the dumpfile so he/she can import it and create the dblinks. Right? Not always.

There is one case where you really need of the old syntax.

  • You don’t know the password

AND

  • You MUST change the database link name.

As you may know, there are no ways to change a database link name (even through impdp, there is no remap_dblink or anything like that).

E.g., you need to keep the db link and intend to use it for a check BUT you want to prevent the application from using it with the old name.

Because I believe that no problems exist that cannot be solved by my Trivadis’ colleagues, I’ve checked internally. A colleague came out with a dead simple (and unsupported) solution:

Insert/update sys.link$, flush the shared_pool.

Remember, use it at your own risk (or don’t use it at all) 😉

HTH

Ludovico