SQL Plan Directives: they’re always good… except when they’re bad!

The new Oracle 12c optimizer adaptive features are just great and work well out of the box in most cases.

Recently, however, I experienced my very first problem with SQL Plan Directives while migrating a database to 12c, so I would like to share it.

Disclaimer 1: this is a specific problem that I found on ONE system. My solution may not fit with your environment, don’t use it if you are not sure about what you’re doing!

Disclaimer 2: although I had this problem with a single SPD, I like the adaptive features and I encourage you to use them!!

Problem: a query takes a sub-second in 11gR2, in 12c it takes 12 seconds or more.

V_TAB_PROP is a very simple view. It selects from a central table “TAB” and then takes different properties by joining a property table “TAB_PROP”.

To do that, it does 11 joins on the same property table.

On the property table, TAB_PROP_ID and PROP_ID are unique (they form the primary key), so nested loops and index unique scans are the best way to get this data.
The table is 1500 MB and the index 1000 MB.

This was the plan in 11g:

In 12c, the plan switches to adaptive, and half of the joins are converted to hash joins / full table scans:

However, the inflection point is never reached. The execution keeps the default plan that has half of the joins HJ and the other half NL.

The problem in this case is the SQL Plan Directive. Why?

There are too many distinct values for TAB_ID and the data is very skewed.

The histogram on that column is OK and it always leads to the correct plan (with the adaptive features disabled).
But there are still some “minor” misestimates, and the optimizer sometimes decides to create a SQL Plan directive:
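The directives involved can be spotted in the data dictionary. A query along these lines (the schema and table names are placeholders for the real ones) shows the state, the reason and whether each directive is enabled:

```sql
-- Directives created on our table (owner/name are placeholders)
SELECT d.directive_id, d.type, d.state, d.enabled, d.reason
FROM   dba_sql_plan_directives  d
JOIN   dba_sql_plan_dir_objects o ON o.directive_id = d.directive_id
WHERE  o.owner = 'MYSCHEMA'
  AND  o.object_name = 'TAB';
```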

The directive instructs the optimizer to do dynamic sampling, but with such a big and skewed table this is not OK, so the dynamic sampling result is worse than using the histogram. I verified it by simplifying the query down to a single join:

What’s the fix?

I’ve tried to drop the directive first, but it reappears as soon as there are new misestimates.
The best solution in my case has been to disable the directive, an operation that can be done easily with the DBMS_SPD package:
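For reference, a minimal sketch of the disable operation (the directive ID is obviously an example; take the real one from DBA_SQL_PLAN_DIRECTIVES):

```sql
-- Flush any pending directives to the dictionary first, then disable ours
EXEC DBMS_SPD.FLUSH_SQL_PLAN_DIRECTIVE;
BEGIN
  DBMS_SPD.ALTER_SQL_PLAN_DIRECTIVE(1234567890123456789, 'ENABLED', 'NO');
END;
/
```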

I did this on a QAS environment.
Because the production system has not been migrated to 12c yet, it's wise to import these disabled directives into production before the optimizer creates and enables them there.

Of course, directives can't be created for objects that do not exist, so the import has to be done after the objects are migrated to the 12c database.
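The transfer itself can be done with the DBMS_SPD staging procedures; a hedged sketch (table and schema names are examples):

```sql
-- On the source (QAS) database: create a staging table and pack the directives
EXEC DBMS_SPD.CREATE_STGTAB_DIRECTIVE('SPD_STAGE', 'MYSCHEMA');
DECLARE
  n NUMBER;
BEGIN
  n := DBMS_SPD.PACK_STGTAB_DIRECTIVE('SPD_STAGE', 'MYSCHEMA');
  DBMS_OUTPUT.PUT_LINE(n || ' directives packed');
END;
/

-- Move SPD_STAGE to production (e.g. with Data Pump), then unpack it there:
DECLARE
  n NUMBER;
BEGIN
  n := DBMS_SPD.UNPACK_STGTAB_DIRECTIVE('SPD_STAGE', 'MYSCHEMA');
  DBMS_OUTPUT.PUT_LINE(n || ' directives unpacked');
END;
/
```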

Because SQL Plan Directives are tied to specific objects and not to specific queries, they can fix many statements at once; but, in cases like this one, they can also compromise several statements!

Monitoring the creation of new directives is an important task, as it may indicate misestimates or a lack of statistics on one side, or execution plan changes on the other.

Oracle RAC and the Private Cloud. And why small customers are not implementing it. Not yet.

Cloud. What a wonderful word. Wonderful and gray.
If you are involved in the Oracle Community, blogs and conferences, you certainly care about it and have perhaps your own conception of it or ideas about how to implement it.

My Collaborate 2015 RAC SIG experience

During the last Collaborate Conference, I “tried” to animate the traditional RAC SIG Round-Table with this topic:

In the last few years, cloud computing and infrastructure optimization have been the leading topics that guided the IT innovation. What’s the role of Oracle RAC in this context?

During this meeting leading RAC specialists, product managers, RAC SIG representatives and RAC Attack Ninjas will come together and discuss with you about the new Oracle RAC 12c features for the private cloud and the manageability of RAC environments.

Join us for the great discussion. This is your chance to have a great networking session!

Because it’s the RAC SIG meeting, most of the participants DO HAVE a RAC environment to manage, and are looking for best practices and ideas to improve it, or maybe they want to share their experiences.

I’ve started the session by asking how many people are currently operating a private cloud and how many would like to implement it.

To my great surprise (so great that I immediately felt uncomfortable), nobody raised their hand except one single person.

What?

I spent a very bad minute; I was almost speechless, actually asking myself: “is my conception of private cloud wrong?”. Then my good friend Yury came to the rescue and we started discussing the RAC features that enable private cloud capabilities. During those 30 minutes, almost no users intervened. Then the Oracle Product Managers (RAC, ASM, QoS, Cloud) started explaining their point of view, and I suddenly realized that

when talking about Private Cloud, there is a huge gap between the Oracle Private Cloud implementation best practices and the small customers skills and budgets.

When Oracle product managers talk about Private Cloud, they target big companies and advise planning the infrastructure using:

  • Exadata
  • Full pack of options for a total of 131.5k per CPU:
    • Enterprise Edition (47.5k)
    • Multitenant (17.5k)
    • Real Application Clusters (23k)
    • Partitioning (11.5k)
    • Diagnostic Pack (7.5k)
    • Tuning Pack (5k)
    • Lifecycle Management Pack (12k)
    • Cloud Management Pack (7.5k)
  • Flex Cluster
  • Policy Managed Databases
  • Quality of Services Management
  • Rapid Home provisioning
  • Enterprise Manager and DBaaS Self Service portal

The CapEx needed for such a stack is definitely a show stopper for most small-medium companies. And it's not only about the cost. When I gave my presentation about Policy Managed Databases at Collaborate in 2014, and later about Multitenant and MAA at Open World, it was clear that “almost” nobody (let's say less than 5%, just to give an idea) uses these new technologies. Many of them are new and, in some cases, not yet stable. Notably, Multitenant and QoS do not work together as of now: QoS will work with the new resource manager at PDB level only in release 12.2 (and that's still not guaranteed).

For the average company (or the average DBA), there is more than enough to be scared about, so private cloud is not seen as easy to implement.

So there’s no private cloud solution for SMBs?

It really depends on what you want to achieve, and at which level.

Based on my experience at Trivadis, I can say that you can achieve Private Cloud for less. Much less.

What should a Private Cloud guarantee? According to its NIST definition, five things:

  1. On-demand self-service.
  2. Broad network access.
  3. Resource pooling.
  4. Rapid elasticity.
  5. Measured service.

Number 5 is clearly a field for EM, and the new AWR Warehouse feature may be of great help, for free (but you can still do a lot on your own with Statspack and some scripting, if you are crazy enough to do it without the Diagnostic Pack).

Numbers 3 and 4 are a peculiarity of RAC, and they are included in the EE+RAC license. By leveraging OVM, there are very good opportunities of savings if the initial sizing of the solution is a problem. With OVM you can start as small as you want.

Number 1 depends on standards and automation already in place at your company. Generally speaking, nowadays scripting automatic provisioning with DBCA and APEX is very simple. If you’re not comfortable with coding, tools like the Trivadis Toolbox make this task easier. Moreover, nobody said that the self-service provisioning must be done through a web interface by the final user. It might be (and usually is) triggered by an event, like the creation of a service request, so you can keep web development outside of your cloud.
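As an illustration only, a self-service request could end up triggering something as simple as a DBCA silent call; everything below (names, template, options) is hypothetical and must be adapted to your standards:

```sh
#!/bin/sh
# Hypothetical provisioning script triggered by a service request:
# creates a database from a template, on ASM, in silent mode.
DBNAME=$1
dbca -silent -createDatabase \
     -templateName General_Purpose.dbc \
     -gdbname "$DBNAME" -sid "$DBNAME" \
     -storageType ASM -diskGroupName DATA \
     -totalMemory 2048 \
     -sysPassword "${SYS_PWD}" -systemPassword "${SYSTEM_PWD}"
```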

Putting it all together

You can create a basic Private Cloud that fits perfectly your needs without spending or changing too much in your RAC environment.

Automation doesn't mean cost: you can do it on your own and keep it simple. If you need advice, ideas or some help, just drop me an email (myfirstname.mylastname@trivadis.com); it would be great to discuss your needs for private cloud!

Things can be less complex than what we often think. Our imagination is the new limit!

Ludovico

#C15LV RAC Attack wrap

Did I say in some of my previous posts that I love RAC Attack? I love it even more during the Collaborate conference, because doing it as a pre-conference workshop is just the right way to get people involved and go straight to the goal: learning while having fun together.

We had far fewer people than expected, but it still was a great success!

The t-shirts were great for coloring the room: as soon as people finished the installation of the first Linux VM, they got a t-shirt.
DSC04023

Look at the room at the beginning of the workshop:

DSC04034

 

after a few hours, the room looked better! New ninjas, red stack, happy participants :-)

DSC04118

We had a very special guest today. Mr. RAC PM in person has tried and validated our installation instructions 😉
DSC04049

We got pizza again, but because of restrictions at the convention center, it has been a beer-free afternoon :-(

Thank you anyway to the OTN for sponsoring almost everything!!

DSC04100

 

Looking forward to organizing the next RAC Attack. Thank you guys!! :-)


 

 

 

Ludo

 

It’s time to Collaborate again!!

In a little more than a couple of weeks, the great Collaborate conference will start again.

My agenda will be quite packed again, as speaker, panelist and workshop organizer:

  • 08/04/2015, 3:15 pm - 4:15 pm: Oracle RAC, Data Guard, and Pluggable Databases: When MAA Meets Oracle Multitenant (IOUG Collaborate 15, Las Vegas NV)
  • 08/04/2015, 4:30 pm - 5:30 pm: Panel: Nothing to BLOG About - Think Again (IOUG Collaborate 15, Las Vegas NV)
  • 12/04/2015, 9:00 am - 4:00 pm: RAC Attack! 12c (IOUG Collaborate 15, Las Vegas NV)
  • 15/04/2015, 5:30 pm - 6:00 pm: IOUG RAC SIG Meeting (IOUG Collaborate 15, Las Vegas NV)

 

RAC Attack! 12c

This technical workshop and networking event (never forget it's a project created several years ago thanks to an intuition of Jeremy Schneider) confirms itself as one of the best, long-living projects in the Oracle Community. It certainly boosted my Community involvement up to becoming an Oracle ACE. This year I'm coordinating the organization of the workshop, so it's a double satisfaction and it will certainly be a lot of fun again. Did I say that it's already fully booked? I've already blogged about it (and about what the lucky participants will get) here.

 

Oracle RAC, Data Guard, and Pluggable Databases: When MAA Meets Oracle Multitenant 

One of my favorite presentations, I’ve presented it already at OOW14 and UKOUG Tech14, but it’s still a very new topic for most people, even the most experienced DBAs. You’ll learn how Multitenant, RAC and Data Guard work together. Expect colorful architecture schemas and a live demo!  You can read more about it in this post.

 

Panel: Nothing to BLOG About – Think Again

My friend Michael Abbey (Pythian) invited me to participate in his panel about blogging. It’s my first time as panelist, so I’m very excited!

 

IOUG RAC SIG Meeting

Missing this great networking event is not an option! I’m organizing this session as RAC SIG board member (Thanks to the IOUG for this opportunity!). We’ll focus on Real Application Clusters role in the private cloud and infrastructure optimization. We’ll have many special guests, including Oracle RAC PM Markus Michalewicz, Oracle QoS PM Mark Scardina and Oracle ASM PM James Williams.

Can you ever miss it???

 

A good Trivadis representative!!

trivadis.com

This year I'm not going to Las Vegas alone. My Trivadis colleague Markus Flechtner, one of the most expert RAC technologists I have had the chance to know, will also come and present a session about RAC diagnostics:

615: RAC Clinics- Starring Dr. ORACHK, Dr CHM and Dr. TFA

Mon. April 13| 9:15 AM – 10:15 AM | Room Palm D

If you speak German you can follow his nice blog: http://oracle.markusflechtner.de/

Looking forward to meeting you there

Ludovico

Moving Clusterware Interconnect from single NIC/Bond to HAIP

Very recently I had to move a customer's RAC private interconnect from bonding to HAIP to get the benefit of both NICs.

So I would like to recap here what the hypothetical steps would be if you need to do the same.

In this example I’ll switch from a single-NIC interconnect (eth1) rather than from a bond configuration, so if you are familiar with the RAC Attack! environment you can try to put everything in place on your own.

First, you need to plan the new network configuration in advance, keeping in mind that there are a couple of important restrictions:

  1. Your interconnect interface naming must be uniform on all nodes in the cluster. The interconnect uses the interface name in its configuration and it doesn’t support different names on different hosts
  2. You must bind the different private interconnect interfaces in different subnets (see Note: 1481481.1 – 11gR2 CSS Terminates/Node Eviction After Unplugging one Network Cable in Redundant Interconnect Environment if you need an explanation)

 

Implementation 

The RAC Attack book uses one interface per node for the interconnect (eth1, using network 172.16.100.0)

To make things a little more complex, we’ll not use the eth1 in the new HAIP configuration, so we’ll test also the deletion of the old interface.

What you need to do is add two new interfaces (host-only in your VirtualBox) and configure them as eth2 and eth3, e.g. in networks 172.16.101.0 and 172.16.102.0.
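On each node, the new interfaces would be configured along these lines (the addresses are lab examples; each node gets its own IP in the two subnets):

```
# /etc/sysconfig/network-scripts/ifcfg-eth2 (eth3 is analogous, in 172.16.102.x)
DEVICE=eth2
BOOTPROTO=static
IPADDR=172.16.101.51
NETMASK=255.255.255.0
ONBOOT=yes
```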

 

modify /var/named/racattack in order to use the new addresses (RAC doesn’t care about logical names, it’s just for our convenience):

add also the reverse lookups in in-addr.arpa:
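The entries might look like this, following the RAC Attack naming conventions (names and addresses are examples, adapt them to your setup):

```
; /var/named/racattack -- new private names
collabn1-priv2  IN A   172.16.101.51
collabn2-priv2  IN A   172.16.101.52
collabn1-priv3  IN A   172.16.102.51
collabn2-priv3  IN A   172.16.102.52

; reverse zone 101.16.172.in-addr.arpa (the 102 zone is analogous)
51  IN PTR  collabn1-priv2.racattack.
52  IN PTR  collabn2-priv2.racattack.
```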

 

restart named on the first node and check that both nodes can ping all the names correctly:

check the nodes that compose the cluster:

on all nodes, make a copy of the gpnp profile.xml (just in case, the oifcfg tool does the copy automatically)

List the available networks:

Get the current ip configuration for the interconnect:

On one node only, set the new interconnect interfaces:

check that the other nodes have received the new configuration:
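The oifcfg part of the procedure can be sketched like this (subnets as chosen above):

```sh
# List the available networks and the current interconnect definition
oifcfg iflist -p -n
oifcfg getif

# On one node only: register the two new interfaces as cluster interconnect
oifcfg setif -global eth2/172.16.101.0:cluster_interconnect
oifcfg setif -global eth3/172.16.102.0:cluster_interconnect

# On the other nodes: verify that the configuration has been propagated
oifcfg getif
```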

Before deleting the old interface, it is sensible to stop your cluster resources (in some cases, one of the nodes may be evicted); in any case, the cluster must be restarted completely in order to get the new interfaces working.

Note: having three interfaces in a HAIP interconnect works perfectly; HAIP supports from 2 to 4 interfaces. I'm showing how to delete eth1 just for information!! :-)

on all nodes, shutdown the CRS:

Now you can disable the old interface:

and set ONBOOT=no inside the configuration script of the eth1 interface.

Start the cluster again:

And check that the resources are up & running:
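Putting the restart sequence together (a sketch only; run the commands as root from the Grid Infrastructure home and adapt them to your environment):

```sh
# Remove eth1 from the interconnect definition (global, one node is enough)
oifcfg delif -global eth1/172.16.100.0

# On all nodes: stop the clusterware stack
crsctl stop crs

# Bring the old interface down (and set ONBOOT=no so it stays down)
ifdown eth1

# On all nodes: start the stack again and check the resources
crsctl start crs
crsctl stat res -t
```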

 

 Testing the high availability

Disconnect cable from one of the two interfaces (virtually if you’re in virtualbox :-) )

Pay attention to the NO-CARRIER status (on eth2 in this example):

check that the CRS is still up & running:

 

The virtual interface eth2:1 has failed over to the second interface as eth3:2

 

After the cable is reconnected, the virtual interface is back on eth2:

 

Further information

For this post I've used RAC version 11.2, but RAC 12c uses the very same procedure.

You can discover more here about HAIP:

http://docs.oracle.com/cd/E11882_01/server.112/e10803/config_cw.htm#HABPT5279 

And here about how to set it (beside this post!):

https://docs.oracle.com/cd/E11882_01/rac.112/e41959/admin.htm#CWADD90980

https://docs.oracle.com/cd/E11882_01/rac.112/e41959/oifcfg.htm#BCGGEFEI

 

Cheers

Ludo

RAC Attack at IOUG Collaborate 2015

Once again this year the RAC Attack will be a pre-conference workshop at Collaborate.

Whether you’re a sysadmin, a developer or a DBA, I’m sure you will really enjoy this workshop. Why?

First, you get the opportunity to install a RAC 12c using Virtualbox on your laptop and get coached by many RAC experts, Oracle ACEs and ACE Directors, OCMs and famous bloggers and technologists.

If you've never installed it, it will be very challenging, because you get hands-on experience with network components, shared disks, udev, DNS, virtual machine cloning, OS installation and so on, while being the super-user (root) of your own cluster!! If you're a developer, you can then start developing your applications by testing the failover features of RAC and their scalability by checking for global cache wait events.

If you’re already used to RAC, this year we have not one or two, but three deals for you:

  1. Try the semi-automated RAC installation using Vagrant: you’ll be able to have your RAC up and running in minutes and concentrate on advanced features.
  2. Implement advanced labs such as Flex Cluster and Flex ASM or Policy Managed Databases, and discover Hub and Leaf nodes, Server Pools and other features
  3. Ask the ninjas to show you other advanced scenarios or just discuss about other RAC related topics

 

Isn't that enough?

The participants that will complete at least the Linux install (very first stage of the workshop) will get an OTN-sponsored T-shirt of the event, with the very new RAC SIG Logo (the image is purely indicative, the actual design may change):

t-shirt-c15lv

 

Still not enough?

We’ll have free pizza (at lunch) and beer (in the afternoon), again sponsored by the Oracle Technology Network. Can’t believe it? Look at a few images from last year’s edition:

20140407_121101


IMG_0210

BkpvaPBCAAAULIB

RACAttackC14LV

Check the pre-conference workshops on the IOUG Collaborate 15 website and don't forget to fulfill the requirements before attending the workshop:

To participate in the workshop, participants need to bring their own laptop. Recommended specification: a) any 64 bit OS that supports Oracle Virtual Box b) 8GB RAM, 45GB free HDD space, SSD recommended.

Important: you are required to pre-download Oracle Database 12c and Oracle Grid Infrastructure 12c for Linux x86-64 from the Oracle website http://tinyurl.com/rac12c-dl (four files: linuxamd64_12c_database_1of2.zip linuxamd64_12c_database_2of2.zip linuxamd64_12c_grid_1of2.zip linuxamd64_12c_grid_2of2.zip). Due to license restrictions it's not possible to distribute Oracle Software.

Looking forward to meeting you there!!!

— 

Ludovico

Get the last database backup for all databases in a SQL Server instance

I don’t like to publish small code snippets, but I’ve just rewritten one of my most used SQL scripts for SQL Server that gets the details about the last backup for every database (for both backup database and backup log).

It now makes use of a common table expression (WITH) and RANK() OVER () to make it much easier to read and modify.

So I think it's worth sharing.

As you can see, modifying it to include, for example, incremental backups should be very easy.
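A sketch of a query along the same lines, built on msdb's backup history (my own reconstruction, not the exact script from the post; adapt columns and filters as needed):

```sql
-- Last full ('D') and log ('L') backup per database, using a CTE and RANK()
WITH ranked AS (
    SELECT bs.database_name,
           bs.type,               -- 'D' = full, 'L' = log, 'I' = differential
           bs.backup_finish_date,
           RANK() OVER (PARTITION BY bs.database_name, bs.type
                        ORDER BY bs.backup_finish_date DESC) AS rk
    FROM msdb.dbo.backupset AS bs
)
SELECT d.name AS database_name,
       MAX(CASE WHEN r.type = 'D' THEN r.backup_finish_date END) AS last_full_backup,
       MAX(CASE WHEN r.type = 'L' THEN r.backup_finish_date END) AS last_log_backup
FROM sys.databases AS d
LEFT JOIN ranked AS r
       ON r.database_name = d.name AND r.rk = 1
GROUP BY d.name
ORDER BY d.name;
```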

Cheers

Ludo

My Collaborate 14 articles about Active Data Guard 12c and Policy Managed Databases

After almost 1 year, I’ve decided to publish these articles on my Slideshare account. You may have already seen them in the IOUG Collaborate 14 conference content or in the SOUG Newsletter 2014/4. Nothing really new, but I hope you’ll still enjoy them.


Cheers

Ludo

A 2014 of technology, sharing, friends, fun. And ideas for the next year.

2014 has been a great year, and it’s time to summarize it quickly.

A few numbers

  • 24 blog posts
  • ~37000 page views, >29000 visits
  • Speaker at 3 major conferences (#C14LV, #OOW14, #UKOUG_TECH14)
  • Speaker at 2  Trivadis internal conferences
  • Speaker at 2 local user group events
  • Speaker at an Oracle Business Breakfast
  • A total of 14 public speeches
  • I’ve been included in the ACE Program
  • 3 RAC Attack workshops (one co-organized and another self-organized)
  • 1 roundtable as co-organizer
  • 2 T-shirt designed as gifts for 2 RAC Attack workshops
  • 3 articles published
  • 3 new websites for the community (The new @IT_OUG website and two still in progress)
  • Joined 3 user group boards (SOUG-R, ITOUG and RAC SIG (well, RAC SIG and ITOUG were actually in 2013…))
  • Countless new friends and/or contacts

 

UKOUG Tech 14 wrap-up

I attended UKOUG Tech for the first time this year and haven't found the time to blog about it. The conference was very good: there is a lot of good content and the agenda is great. But I've heard that the conference was bigger and better in the past; that's why the UKOUG is moving it back to Birmingham in 2015. I sincerely hope they will get back the attendance this conference deserves.

Sadly, the RAC Attack was not organized and advertised properly, so it was a failure compared to the previous ones I've attended/organized.

I have learnt many new technical things and I also have learned some lessons:

  • Never under-estimate the time it takes to say goodbye to everyone before running to the train station if you’re in a rush for the train to the airport
  • Even if the temperature is higher than in Switzerland, the wind-chill in Liverpool makes it feel much colder
  • Be careful when English people invite you for a quick beer: it's not one beer and it's not quick :-) (especially when you go back home at 2 AM and expect to be awake at 6 AM, check out of the hotel and prepare the demo for your first presentation in the morning)

 

Once again, I’ve met many new friends :-)

 

6 months in the ACE program

I've spent my first 6 months in the ACE Program; it would take too long to talk about it all. The best part hasn't been being in the program itself: the best has been the effort, the feedback and the new connections I've gained as an active member of the community. It's a path that I recommend to all my Oracle-involved friends and colleagues because it's very rewarding. I hope to be able to blog more about it in 2015 😉

 

A couple of pictures from 2014…

10532989_10204805457406497_9118297674875147610_n By-X9wECEAAYiFb DSC02619 1781065_612698348817395_3758740240585039433_o BlJFNzQCAAA4Ikp BkpvaPBCAAAULIB

 


B1vurMxIEAAXgRg

Projects for 2015

  • My flights are already booked for the next Collaborate 15 #C15LV. See you in Vegas :-)
  • About the ITOUG, the die is cast, I’ll certainly contribute more in Italian with articles, webcasts and general duties related to the OUG administration. Keep an eye on www.itoug.it
  • Other secret plans that I cannot unveil now 😉

Happy new year to everyone!

Ludo

Cloning a PDB with ASM and Data Guard (no ADG) without network transfer

Ok, if you’re reading this post, you may want to read also the previous one that explains something more about the problem.

Briefly said, if you have a CDB running on ASM in an MAA architecture and you do not have Active Data Guard, when you clone a PDB you have to “copy” the datafiles somehow to the standby. The only solution offered by Oracle (in a MOS note, not in the documentation) is to restore the PDB from the primary to the standby site, thus transferring it over the network. But if you have a huge PDB this is a bad solution, because it impacts your network connectivity. (Note: ending up with a huge PDB IMHO can only be caused by bad consolidation. I do not recommend consolidating huge databases on Multitenant.)

So I've worked out another solution. It still has many defects and is hardly viable, but it's technically interesting, because it allows discovering a little more about Multitenant and Data Guard.

The three options

At the primary site, the process is always the same: Oracle copies the datafiles of the source, and it modifies the headers so that they can be used by the new PDB (so it changes CON_ID, DBID, FILE#, and so on).

On the standby site, by contrast, it changes depending on the option you choose:

Option 1: Active Data Guard

If you have ADG, ADG itself will take care of copying the datafiles on the standby site, from the source standby PDB to the destination standby PDB. Once the copy is done, MRP0 will continue the recovery. The modification of the header block of the destination PDB is done by MRP0 immediately after the copy (at least, this is what I understand).

ADG_PDB_copy

Option 2: No Active Data Guard, but STANDBYS=none

In this case, the copy on the standby site doesn't happen, and the recovery process just adds the entries for the new datafiles in the controlfile, with status OFFLINE and name UNKNOWNxxx. However, the source files cannot be copied anymore, because the MRP0 process expects a copy of the destination datafile, not the source datafile. Also, any attempt to restore datafile 28 (in this example) will give an error, because it does not belong to the destination PDB. So the only chance is to restore the destination PDB from the primary.
NOADG_PDB_STANDBYS_NONE_copy

Option 3: No Active Data Guard, no STANDBYS=none

This is the case that I actually want to explain. Without the STANDBYS=NONE clause, the MRP0 process expects to change the header of the new datafile, but because the file does not exist yet, the recovery process dies.
We can then copy it manually from the source standby PDB and restart the recovery process, which will change the header. This needs to be repeated for each datafile (that's why it's not a viable solution, right now).

NOADG_PDB_copy

Let’s try it together:

The Environment

Primary

Standby

The current user PDB (any resemblance to real people is purely coincidental 😉 #haveUSeenMaaz):

Cloning the PDB on the primary

First, make sure that the source PDB is open read-only

Then, clone the PDB on the primary without the clause STANDBYS=NONE:
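A hedged sketch of these two steps, with the PDB names used in this example:

```sql
-- On the primary: the source PDB must be open read-only to be cloned
ALTER PLUGGABLE DATABASE maaz CLOSE IMMEDIATE;
ALTER PLUGGABLE DATABASE maaz OPEN READ ONLY;

-- Clone it, on purpose WITHOUT the STANDBYS=NONE clause
CREATE PLUGGABLE DATABASE ludo FROM maaz;
```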

Review the clone on the Standby

At this point, on the standby the alert log show that the SYSTEM datafile is missing, and the recovery process stops.

One remarkable thing, is that in the standby controlfile, ONLY THE SYSTEM DATAFILE exists:

We need to fix the datafiles one by one, but most of the steps can be done once for all the datafiles.

Copy the source PDB from the standby

What do we need to do? Well, the recovery process is stopped, so we can safely copy the datafiles of the source PDB from the standby site, because they have not moved yet (meanwhile, we can put the primary source PDB back in read-write mode).

Copy the datafiles:
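With ASM, the copy can be done with asmcmd; every path below is a placeholder, the real names come from v$datafile of the source standby PDB:

```sh
# As the grid user, with the ASM environment set (paths are placeholders)
asmcmd cp +DATA/CDBSTBY/<SOURCE_PDB_GUID>/DATAFILE/SYSTEM.280.123456789 \
          +DATA/ludo_system_copy.dbf
asmcmd cp +DATA/CDBSTBY/<SOURCE_PDB_GUID>/DATAFILE/SYSAUX.281.123456789 \
          +DATA/ludo_sysaux_copy.dbf
```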

Do the magic

Now there’s the interesting part: we need to assign the datafile copies of the maaz PDB to LUDO.

Sadly, OMF will create the copies in the wrong location (it's a copy, so they are created in the same location as the source PDB).

We cannot try to uncatalog and recatalog the copies, because they will ALWAYS be attributed to the source PDB. Nor can we use RMAN, because it will never associate the datafile copies with the new PDB. We need to rename the files manually.

It's better to uncatalog the datafile copies first, so we keep the catalog clean:

Then, because we cannot rename files on a standby database with standby file management set to AUTO, we need to put it temporarily to MANUAL.

standby_file_management is not PDB modifiable, so we need to do it for the whole CDB.

Then we need to set standby_file_management=auto back, or the recovery will not start:
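All together, the sequence looks more or less like this; every file name below is a placeholder, take the real ones from v$datafile on the standby:

```sql
-- In RMAN, keep the catalog clean first:
-- CHANGE DATAFILECOPY '+DATA/ludo_system_copy.dbf' UNCATALOG;

-- Renaming datafiles requires manual standby file management (CDB-wide)
ALTER SYSTEM SET standby_file_management = MANUAL;

-- Assign our copy to the datafile entry of the new PDB
ALTER DATABASE RENAME FILE
  '+DATA/CDBSTBY/DATAFILE/SYSTEM_UNNAMED' TO '+DATA/ludo_system_copy.dbf';

-- Set it back, or the recovery will not start
ALTER SYSTEM SET standby_file_management = AUTO;
```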

We can now restart the recovery.

The recovery process will:
– change the new datafile by modifying the header for the new PDB
– create the entry for the second datafile in the controlfile
– crash again because the datafile is missing

We already have the SYSAUX datafile, right? So we can alter the name again:

This time all the datafiles have been copied (no user datafiles in this example) and the recovery process will continue!! :-) So we can hit ^C and restart it in the background.

The Data Guard configuration reflects the success of this operation.

Do we miss anything?

Of course we do!! The datafiles of the new PDB reside in the wrong ASM path. We need to fix them!

 

I know there's no practical use for this procedure, but it helps a lot in understanding how Multitenant has been implemented.

I expect some improvements in 12.2!!

Cheers

Ludo